Publications

Generalized Deep Neural Network Model for Cuffless Blood Pressure Estimation with Photoplethysmogram Signal Only

Hsu, Yan-Cheng; Li, Yung-Hui; Chang, Ching-Chun; Harfiya, Latifa N.
Sensors 20, no. 19: 5668
  • Developed a deep-neural-network model for estimating blood pressure with a novel statistical feature selection method.
  • Achieved state-of-the-art performance satisfying both AAMI and BHS standards for blood pressure measurement devices.
  • Cited by Mitsubishi Electric patent US20230063221A1 for cuffless BP estimation methodology.
On the Optimal Self-Supervised Multi-Fault Detector for Temperature Sensor Data

Harfiya, Latifa N.; Hsu, Yan-Cheng; Li, Y.H.; Wang, J.C.
APSIPA ASC 2023 (Oral Presentation)
  • Implemented self-supervised time series transformers, achieving state-of-the-art performance on diverse temporal datasets.
  • Presented findings orally at the IEEE APSIPA ASC 2023 conference.
The Amazon Nova Family of Models: Technical Report and Model Card

Amazon AGI et al. (including Hsu, Yan-Cheng)
Amazon Technical Report
  • Contributed to the GPU infrastructure and HPC systems that enabled training of the Amazon Nova family of foundation models.
  • Part of the High Performance Computing team that built Platform Leviathan for NVIDIA A100/H100 GPU management.

Work Experience

Alibaba Cloud

Project 1 — Cross-Cluster AI Training Platform (humanoid-robotics customer)

  • Architected Heterogeneous Compute Platform: abstracted heterogeneous compute (dedicated GPU + serverless CPU pools) across multiple Kubernetes clusters into a single resource substrate via recursive K8s-on-K8s virtualization (virtual-node-on-virtual-node); enabled Serverless-to-Reserved migration delivering 25x capacity scaling per availability zone at ~40% TCO reduction — the foundational compute layer all downstream SaaS depends on.
  • Engineered Cross-Cluster Identity Mesh: federated multiple Kubernetes clusters via application-layer routing + per-pod secrets-mount credential delivery; remote pods authenticate to customer VPC and private container registry transparently without static credentials.
  • Built AI Dev Workstations as a 0→1 SaaS product: pioneered dual-plane networking enabling simultaneous customer-VPC private egress AND public-internet developer ingress on the same pod — a capability the underlying cloud network model did not natively support; provisions in sub-60s p95 vs days of DIY VPC-peering + bastion setup.
  • Built Distributed Training & Simulation Scheduler as a 0→1 SaaS product: abstracted mixed CPU/GPU training pipelines into a clean job schema; per-component dispatch routes GPU-heavy workers to dedicated hardware and CPU-heavy workers to serverless infrastructure, shielding users from compute heterogeneity while delivering ~30-40% cost reduction on mixed workloads.

Project 2 — RCAgent: Trust-First Multi-Agent RCA System (sole author)

  • Architected Supervisor-Worker Multi-Agent System with 4-Gate Hallucination Defense: separated planning from execution via a supervisor dispatching to specialist sub-agents; new-skill additions are gated by pass^3 ≥ 80% on the regression suite (pass^k is the consistency metric introduced with τ-bench), which surfaces non-deterministic LLM failures that single-run tests miss.
  • Designed Meta-Tool with Hierarchical Skill Tree: agents perform multi-round retrieval over a 3-layer skill tree, narrowing the candidate set at each descent; yields avg 6 tools loaded per invocation out of 200+ available (~3% surface), invariant to catalog growth.
Company: Alibaba Cloud
Organization: AnalyticDB Org, AI Training Platform
Position: Infrastructure Software Engineer II (official title: SRE II)
Incumbency: Jul. 2025 - present
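
A minimal sketch of the pass^3 gate described under Project 2, assuming the pass^k estimator from the τ-bench paper (the per-task fraction C(c, k) / C(n, k), averaged over tasks, where c of n trials succeeded). The function names, the result layout, and the example data are hypothetical; the source does not show RCAgent's actual implementation.

```python
from math import comb


def pass_hat_k(successes: int, trials: int, k: int) -> float:
    """Estimate the probability that k independent trials of one task ALL succeed."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    if not 1 <= k <= trials:
        raise ValueError("k must be between 1 and trials")
    # Fraction of k-sized subsets of the trials that contain only successes.
    return comb(successes, k) / comb(trials, k)


def suite_pass_hat_k(results: dict[str, list[bool]], k: int = 3) -> float:
    """Average the per-task estimate over the regression suite."""
    per_task = [pass_hat_k(sum(runs), len(runs), k) for runs in results.values()]
    return sum(per_task) / len(per_task)


def gate_new_skill(results: dict[str, list[bool]], k: int = 3,
                   threshold: float = 0.80) -> bool:
    """Admit a new skill only if the suite-level pass^k meets the threshold."""
    return suite_pass_hat_k(results, k) >= threshold


# Hypothetical regression results: task name -> outcome of each repeated run.
example = {
    "disk_full_rca": [True, True, True],
    "oom_kill_rca": [True, True, False],  # flaky: fails one run in three
}
print(suite_pass_hat_k(example))  # 0.5, although 5 of the 6 individual runs passed
print(gate_new_skill(example))    # False
```

A single flaky task contributes zero to pass^3 when only three runs are recorded, which is why this kind of gate is stricter than a single-run pass rate.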
Amazon.com LLC

Platform Leviathan: NVIDIA A100/H100 Infrastructure for Amazon Nova

  • GPU Lifecycle & Blocklisting: Architected a scalable tracking system handling thousands of concurrent scaling requests; designed a parent-child DAG in Apache Airflow; eliminated circular termination loops, delivering multi-million-dollar annualized savings.
  • Automated Reliability Orchestrator: Designed a fault-identification workflow for large-scale GPU clusters using divide-and-conquer with dynamic K8s node labeling; reduced troubleshooting time by ~90% per incident.
  • API & Monitoring: Built a customer-facing API/CLI (Lambda, Smithy); reduced DynamoDB lookup complexity from O(N) to O(log N), improving query response latency by over 70%.
Company: Amazon.com LLC
Organization: AGI Org, High Performance Computing
Position: Software Development Engineer
Incumbency: Oct. 2023 - Jul. 2025
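
The divide-and-conquer fault identification mentioned for the Automated Reliability Orchestrator can be illustrated with a generic bisection sketch. It assumes a group-level health check that passes only when every node in the group is healthy; `group_passes`, the node names, and the label values are hypothetical, and the real workflow's checks and Kubernetes labels are not described in the source.

```python
from typing import Callable, Sequence


def isolate_faulty(nodes: Sequence[str],
                   group_passes: Callable[[Sequence[str]], bool]) -> list[str]:
    """Return the faulty nodes by recursively halving any group that fails its check."""
    if not nodes or group_passes(nodes):
        return []                      # whole group is healthy, stop descending
    if len(nodes) == 1:
        return [nodes[0]]              # a failing group of one is the culprit
    mid = len(nodes) // 2
    return (isolate_faulty(nodes[:mid], group_passes)
            + isolate_faulty(nodes[mid:], group_passes))


# Hypothetical cluster with two bad nodes; the check stands in for a real group test.
cluster = [f"node-{i}" for i in range(8)]
bad = {"node-2", "node-5"}
faulty = isolate_faulty(cluster, lambda group: not bad.intersection(group))

# Stand-in for dynamic node labeling: mark each node according to the result.
labels = {n: ("suspect" if n in faulty else "healthy") for n in cluster}
print(faulty)  # ['node-2', 'node-5']
```

When faults are rare, most halves pass on the first check, so far fewer checks are needed than testing every node individually.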
University of California San Diego

Comcom Website: Command Line Tools on The Web

  • Developed, packaged, and deployed a decoupled web application on AWS EC2, using Linux shell scripts and Docker images.
  • Built a multi-threaded backend with a SQL database and file-sharing system using Flask and Django with RESTful APIs.
  • Built the frontend with React.js and Node.js, integrating event, state, and proxy management.
Company: University of California San Diego
Organization: Computer Science Department
Position: Full-Stack Developer
Incumbency: Feb. 2023 - Jun. 2023
Amazon.com LLC

Alexa Secure AI Platform: Sensai Self-Service Onboarding Platform

  • Created and launched a decoupled web application on AWS Lambda and CloudFront for secure onboarding.
  • Developed Auto-Verification, Canaries, Access Control, and Monitoring modules using the AWS CDK and SDK, drastically reducing app/API integration time.
  • Refined the existing web UI to extend the self-service capabilities of the onboarding system.
Company: Amazon.com LLC
Organization: Alexa Org
Position: Software Dev Engineer Intern
Incumbency: Jun. 2022 - Sep. 2022
Wiwynn Inc (a Wistron subsidiary)

Prometheus Infrastructure Testing Data Analysis and Software Toolkit Development

  • Established a prototype data pipeline for production line testing data analysis.
  • Developed three comprehensive Python packages for efficient data collection, alignment, and analysis.
  • Narrowed the search scope for production-line performance improvements by approximately 66%.
Company: Wiwynn Inc (a Wistron subsidiary)
Position: Software Dev Engineer Intern
Incumbency: Jul. 2021 - Aug. 2021
Machine Learning and Biometric Recognition Lab

Deep Neural Network Predictor for Blood Pressure Estimation

  • Designed and implemented a deep learning model with a novel physiological feature selection algorithm.
  • Improved accuracy by ~1.8x and incorporated ~6x more data, achieving an MAE of 2.73 mmHg across 2.5M+ cardiac cycles.
  • Published this work in the international journal Sensors (vol. 20, no. 19).
Company: Machine Learning and Biometric Recognition Lab
Organization: National Central University
Position: Deep Learning Research Assistant
Incumbency: Dec. 2020 - Oct. 2021