Publications

Generalized Deep Neural Network Model for Cuffless Blood Pressure Estimation with Photoplethysmogram Signal Only

Hsu, Yan-Cheng; Li, Yung-Hui; Chang, Ching-Chun; Harfiya, Latifa N.
Sensors 20, no. 19: 5668
  • Developed a deep-neural-network model for estimating blood pressure with a novel statistical feature selection method.
  • Achieved cutting-edge performance satisfying both AAMI and BHS standards for blood pressure measurement devices.
GitHub
On the Optimal Self-Supervised Multi-Fault Detector for Temperature Sensor Data

Harfiya, Latifa N.; Hsu, Yan-Cheng; Li, Y.H.; Wang, J.C.
APSIPA ASC 2023 (Oral Presentation)
  • Implemented self-supervised time series transformers, securing state-of-the-art performance on diverse temporal datasets.
  • Presented findings orally at the IEEE APSIPA ASC 2023 conference.
The Amazon Nova Family of Models: Technical Report and Model Card

Amazon AGI et al. (including Hsu, Yan-Cheng)
Amazon Technical Report
  • Contributed to the GPU infrastructure and HPC systems that enabled training of the Amazon Nova family of foundation models.
  • Part of the High Performance Computing team that built Platform Leviathan for NVIDIA A100/H100 GPU management.

Work Experience

Alibaba Cloud

Cross-Cluster AI Training Infrastructure for Unitree G1-D

  • Designed Dual-Layer Virtual Kubelet architecture orchestrating massive-scale heterogeneous GPU fleet; enabled Serverless-to-Reserved migration achieving ~40% TCO reduction and 25x capacity scaling for Unitree G1 humanoid robots.
  • Engineered Federated Identity Mesh enabling cross-cluster AuthN/AuthZ via external Secret injection; resolved split-brain identity challenges with Custom Controllers implementing automated token rotation.
  • Automated VPC provisioning via Terraform (Security Groups, CLB ACLs) to bypass CoreDNS isolation; rearchitected Ray service discovery for low-latency cross-boundary communication.
  • Built Kubernetes Controller with telemetry pipeline (DCGM metrics, kernel traces); implemented reconciliation loop triggering automated cordon/drain for faulty GPU nodes.
Company: Alibaba Cloud
Organization: AnalyticDB Org, AI Training Platform
Position: Infrastructure Software Engineer II (Official: SRE II)
Incumbency: Jul. 2025 - present
Amazon.com LLC

Platform Leviathan: NVIDIA A100/H100 Infrastructure for Amazon Nova

  • GPU Lifecycle & Blocklisting: Architected scalable tracking system handling thousands of concurrent scaling requests; designed parent-child DAG in Apache Airflow; eliminated circular termination loops, delivering multi-million dollar annualized savings.
  • Automated Reliability Orchestrator: Designed fault identification workflow for large-scale GPU clusters using divide-and-conquer with dynamic K8s node labeling; reduced troubleshooting time by ~90% per incident.
  • API & Monitoring: Built customer-facing API/CLI (Lambda, Smithy); optimized DynamoDB query complexity from O(N) to O(log N), reducing query response latency by over 70%.
Company: Amazon.com LLC
Organization: AGI Org, High Performance Computing
Position: Software Development Engineer
Incumbency: Oct. 2023 - Jul. 2025
University of California San Diego

Comcom Website: Command Line Tools on The Web

  • Developed, packaged, and deployed a decoupled web application on AWS EC2, using Linux shell scripts and Docker images.
  • Built a multi-threaded backend with a SQL database and file-sharing system using Flask and Django with RESTful APIs.
  • Built the frontend with React.js and Node.js, integrating event, state, and proxy management.
Company: University of California San Diego
Organization: Computer Science Department
Position: Full-Stack Developer
Incumbency: Feb. 2023 - Jun. 2023
Amazon.com LLC

Alexa Secure AI Platform: Sensai Self-Service Onboarding Platform

  • Created and launched a decoupled web application on AWS Lambda and CloudFront for secure onboarding.
  • Developed Auto-Verification, Canaries, Access Control, and Monitoring modules using the AWS CDK and SDK, drastically reducing app/API integration time.
  • Refined the existing web UI to extend the self-service capabilities of the onboarding system.
Company: Amazon.com LLC
Organization: Alexa Org
Position: Software Dev Engineer Intern
Incumbency: Jun. 2022 - Sep. 2022
Wiwynn Inc (Acer's Child Company)

Prometheus Infrastructure Testing Data Analysis and Software Toolkit Development

  • Established a prototype data pipeline for production line testing data analysis.
  • Developed three comprehensive Python packages for efficient data collection, alignment, and analysis.
  • Narrowed the scope of production-line performance improvement work by approximately 66%.
Company: Wiwynn Inc (Acer's Child Company)
Position: Software Dev Engineer Intern
Incumbency: Jul. 2021 - Aug. 2021
Machine Learning and Biometric Recognition Lab

Deep Neural Network Predictor for Blood Pressure Estimation

  • Designed and implemented a deep learning model with a novel physiological feature selection algorithm.
  • Enhanced accuracy by ~1.8x and expanded data incorporation by ~6x, achieving MAE of 2.73 mmHg across 2.5M+ cardiac cycles.
  • Published this work in the international journal Sensors (vol. 20, no. 19).
Company: Machine Learning and Biometric Recognition Lab
Organization: National Central University
Position: Deep Learning Research Assistant
Incumbency: Dec. 2020 - Oct. 2021