Publications

Generalized Deep Neural Network Model for Cuffless Blood Pressure Estimation with Photoplethysmogram Signal Only

Hsu, Yan-Cheng; Li, Yung-Hui; Chang, Ching-Chun; Harfiya, Latifa N.
Sensors 20, no. 19 (2020): 5668
  • Developed a deep-neural-network model for estimating blood pressure with a novel statistical feature selection method.
  • Achieved cutting-edge performance satisfying both AAMI and BHS standards for blood pressure measurement devices.
GitHub
On the Optimal Self-Supervised Multi-Fault Detector for Temperature Sensor Data

Harfiya, Latifa N.; Hsu, Yan-Cheng; Li, Y.H.; Wang, J.C.
APSIPA ASC 2023 (Oral Presentation)
  • Implemented self-supervised time series transformers, securing state-of-the-art performance on diverse temporal datasets.
  • Presented findings orally at the IEEE APSIPA ASC 2023 conference.
The Amazon Nova Family of Models: Technical Report and Model Card

Amazon AGI et al. (including Hsu, Yan-Cheng)
Amazon Technical Report
  • Contributed to the GPU infrastructure and HPC systems that enabled training of the Amazon Nova family of foundation models.
  • Part of the High Performance Computing team that built Platform Leviathan for NVIDIA A100/H100 GPU management.

Work Experience

Alibaba Cloud

Project Nexus: Cross-Cluster AI Training Infrastructure for Unitree G1-D

  • Designed the Dual-Layer Virtual Kubelet architecture to centralize 10,000+ heterogeneous GPUs, achieving ~40% cost reduction and scaling training capacity by 25x.
  • Implemented a Federated Identity Mesh for secure cross-cluster AuthN/AuthZ with 9-hour token rotation.
  • Automated VPC networking via Terraform and re-architected Ray's service discovery for low-latency communication.

AIOps Hybrid Runtime Observability & Self-Healing

  • Architecting a telemetry pipeline with SysOM agents to tunnel DCGM metrics from isolated GPU sandboxes to SLS/OSS sinks.
  • Developing a custom Kubernetes controller that runs autonomous self-healing workflows (cordon/drain) for unhealthy GPU nodes.
Company: Alibaba Cloud
Organization: AnalyticDB Org - AI Training Platform Resource Management
Position: Site Reliability Engineer II
Incumbency: Jul. 2025 - present
Amazon.com LLC

Platform Leviathan: NVIDIA A100/H100 Infrastructure for Amazon Nova

  • Architected a scalable GPU tracking system handling 3,000+ scaling requests, projected to save $1.5 million annually.
  • Designed a "Bad GPU" identification workflow for 7,000+ GPUs, reducing troubleshooting time by 90% and saving ~100 engineer-hours per month.
  • Re-engineered DynamoDB access patterns, improving API response times by 70% with a 100% failure detection rate.
Company: Amazon.com LLC
Organization: AGI Org - High Performance Computing
Position: Software Development Engineer
Incumbency: Oct. 2023 - Jul. 2025
University of California San Diego

Comcom Website: Command Line Tools on The Web

  • Developed, packaged, and deployed a decoupled web application on AWS EC2, using Linux shell scripts and Docker images.
  • Built a multi-threaded backend with a SQL database and file-sharing system using Flask and Django with RESTful APIs.
  • Built the frontend with React.js and Node.js, integrating event, state, and proxy management.
Company: University of California San Diego
Organization: Computer Science Department
Position: Full-Stack Developer
Incumbency: Feb. 2023 - Jun. 2023
Amazon.com LLC

Alexa Secure AI Platform: Sensai Self-Service Onboarding Platform

  • Created and launched a decoupled web application on AWS Lambda and CloudFront for secure onboarding.
  • Developed Auto-Verification, Canaries, Access Control, and Monitoring modules using the AWS CDK/SDK, reducing app/API integration time from 4 hours to 15 minutes (a reduction of roughly 94%).
  • Refined the existing web UI to extend the self-service capabilities of the onboarding system.
Company: Amazon.com LLC
Organization: Alexa Org
Position: Software Dev Engineer Intern
Incumbency: Jun. 2022 - Sep. 2022
Wiwynn Inc. (Acer Subsidiary)

Prometheus Infrastructure Testing Data Analysis and Software Toolkit Development

  • Established a prototype data pipeline for production line testing data analysis.
  • Developed three comprehensive Python packages for efficient data collection, alignment, and analysis.
  • Narrowed the scope of production-line performance improvement work by approximately 66%.
Company: Wiwynn Inc. (Acer Subsidiary)
Position: Software Dev Engineer Intern
Incumbency: Jul. 2021 - Aug. 2021
Machine Learning and Biometric Recognition Lab

Deep Neural Network Predictor for Blood Pressure Estimation

  • Designed and implemented a deep learning model with a novel physiological feature selection algorithm.
  • Improved accuracy by ~1.8x and expanded data incorporation by ~6x, achieving an MAE of 2.73 mmHg across 2.5M+ cardiac cycles.
  • Published this work in the journal Sensors (vol. 20, no. 19).
Company: Machine Learning and Biometric Recognition Lab
Organization: National Central University
Position: Deep Learning Research Assistant
Incumbency: Dec. 2020 - Oct. 2021