Qingyu Zhou
Summary
- 10+ years of industry experience with back-end, infrastructure and application development
- Excel in cloud solutions based on Amazon Web Services and on-premises data centers
- Familiar with information retrieval & observability infrastructure
- Build, maintain, and scale large infrastructure fleets and distributed clusters
Contact
16322 NE 12th Pl
Bellevue, WA 98008
zhouqingyu.rex@gmail.com
858-729-4674
Education
M.S Computer Science Sept. 2015 — Dec. 2016
University of California, San Diego
Position
Software Engineer Nov. 2024— Current
Amazon, Amazon.com Services LLC, Alexa - Conversational and Learning Team, Bellevue
- Enhanced the prompt construction service, reducing latency by optimizing iteration calls
- Simplified prompt context onboarding and assembly workflows to reduce integration complexity
- Extended the agent architecture from a single-agent model to a multi-agent system to support coordinated task planning and execution
- Improved observability and tracing across multi-agent workflows to ensure end-to-end task traceability
AWS Engineering Manager May. 2022— Nov. 2024
Amazon, Amazon Web Services - External Security Services, Seattle
- Managed and led the AWS GuardDuty Rainier team, part of the Control Plane teams
- Focused on service metering & usage, security findings storage/decoration/publishing, and public APIs
- Worked on security features including GuardDuty Malware, RDS, Lambda and Container Protection
- Expanded the service into new regions: UAE and Zurich
- Ensured the service met compliance requirements and passed security reviews and penetration tests
- Conducted monthly OLRs and weekly 1:1s, defined the operating plan, and participated in escalation on-call rotations
TuSimple Tech Lead Manager, Senior Software Engineer II Jun. 2020— May. 2022
TuSimple, Site Reliability Engineering, San Diego
- Led and managed the Site Reliability Engineering team, growing it from one to seven members
- Handled the OKR/budget planning, hiring & job posting, project management and engineer calibration
- Worked on traffic engineering for L4/L7 load balancing (kube-vip & seesaw / nginx & haproxy), both on-prem and on AWS
- Developed the observability (logging, metrics), monitoring framework (cadence) and alerting stack (grafana/pagerduty)
- Maintained the deployment platform (rancher) and the data plane of the ML Platforms (k8s & calico & volcano)
- Supported the fuse (goofys) for the dataset, the NFS storage system (weka/DDN) and the streaming platform (Kafka)
- Deployed vendor solutions: artifactory, github enterprise, x-ray, notary
- Migrated SRE python and golang repos to mono repos (bazel)
- Built the service catalog (backstage), inventory management (AWS SSM) and some other internal tools
Uber Software Engineer Apr. 2019— Jun. 2020
Uber, Observability Log Search Team, Palo Alto & NYC
- Built the query layer, a Lucene translator, for the next-generation logging platform
- Worked on the storage layer of the new platform, based on ClickHouse, etcd, and ZooKeeper
- Operated the existing ELK logging (storage & ingestion) stack
- Improved the ingestion performance of Logstash pipelines
AWS Software Engineer Feb. 2017— Apr. 2019
Amazon, AWS Search Services, Palo Alto
- Worked on the AWS Search Services team, focusing on AWS CloudSearch and Elasticsearch Service
- Migrated Search Services infrastructure to AWS CloudFormation as an infrastructure-as-code project
- Expanded Elasticsearch Services to new regions, including London, Paris, Beijing, Ningxia, Sweden
- Implemented Elasticsearch Node-to-node Encryption feature, based on AWS ELB/ACM/KMS and HAProxy
- Troubleshot customer issues drawing on day-to-day operational experience, and maintained 100k+ EC2 instances