Publications

Generalized Deep Neural Network Model for Cuffless Blood Pressure Estimation with Photoplethysmogram Signal Only

Hsu, Yan-Cheng; Li, Yung-Hui; Chang, Ching-Chun; Harfiya, Latifa N.
Sensors 20, no. 19: 5668

Developed a deep-neural-network model for estimating blood pressure with a novel statistical feature selection method.
Achieved cutting-edge performance satisfying both AAMI and BHS standards for blood pressure measurement devices.
Cited by Mitsubishi Electric patent US20230063221A1 for cuffless BP estimation methodology.


On the Optimal Self-Supervised Multi-Fault Detector for Temperature Sensor Data

Harfiya, Latifa N.; Hsu, Yan-Cheng; Li, Y.H.; Wang, J.C.
APSIPA ASC 2023 (Oral Presentation)

Implemented self-supervised time series transformers, securing state-of-the-art performance on diverse temporal datasets.
Presented findings orally at the IEEE APSIPA ASC 2023 conference.

The Amazon Nova Family of Models: Technical Report and Model Card

Amazon AGI et al. (including Hsu, Yan-Cheng)
Amazon Technical Report

Contributed to the GPU infrastructure and HPC systems that enabled training of the Amazon Nova family of foundation models.
Part of the High Performance Computing team that built Platform Leviathan for NVIDIA A100/H100 GPU management.

Work Experience

Project 1 — Cross-Cluster AI Training Platform (humanoid-robotics customer)

Architected Heterogeneous Compute Platform: abstracted heterogeneous compute (dedicated GPU + serverless CPU pools) across multiple Kubernetes clusters into a single resource substrate via recursive K8s-on-K8s virtualization (virtual-node-on-virtual-node); enabled Serverless-to-Reserved migration delivering 25x capacity scaling per availability zone at ~40% TCO reduction — the foundational compute layer all downstream SaaS depends on.
Engineered Cross-Cluster Identity Mesh: federated multiple Kubernetes clusters via application-layer routing + per-pod secrets-mount credential delivery; remote pods authenticate to customer VPC and private container registry transparently without static credentials.
Built AI Dev Workstations as a 0→1 SaaS product: pioneered dual-plane networking enabling simultaneous customer-VPC private egress AND public-internet developer ingress on the same pod — a capability the underlying cloud network model did not natively support; provisions in sub-60s p95 vs days of DIY VPC-peering + bastion setup.
Built Distributed Training & Simulation Scheduler as a 0→1 SaaS product: abstracted mixed CPU/GPU training pipelines into a clean job schema; per-component dispatch routes GPU-heavy workers to dedicated hardware and CPU-heavy workers to serverless infrastructure, shielding users from compute heterogeneity while delivering ~30-40% cost reduction on mixed workloads.
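The per-component dispatch described in the scheduler item can be illustrated with a minimal sketch. The platform's actual job schema and pool names are not given here, so `Component`, `route`, `dispatch`, and the pool labels `dedicated-gpu` and `serverless-cpu` are hypothetical, and the routing rule (any GPU request goes to dedicated hardware) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One component of a training job (hypothetical schema)."""
    name: str
    gpus: int  # GPUs requested per replica
    cpus: int  # CPU cores requested per replica


def route(component: Component) -> str:
    """Pick a compute pool for a single component.

    Assumed rule: anything requesting a GPU goes to dedicated hardware,
    CPU-only components go to the serverless pool.
    """
    return "dedicated-gpu" if component.gpus > 0 else "serverless-cpu"


def dispatch(job: list[Component]) -> dict[str, list[str]]:
    """Group the component names of a job by the pool they are routed to."""
    placement: dict[str, list[str]] = {"dedicated-gpu": [], "serverless-cpu": []}
    for component in job:
        placement[route(component)].append(component.name)
    return placement


job = [
    Component("trainer", gpus=8, cpus=32),
    Component("simulator", gpus=0, cpus=64),
    Component("data-loader", gpus=0, cpus=16),
]
placement = dispatch(job)
```

The user submits one job description; the split across pools happens inside `dispatch`, which is the sense in which compute heterogeneity stays hidden from the user.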

Project 2 — RCAgent: Trust-First Multi-Agent RCA System (sole author)

Architected Supervisor-Worker Multi-Agent System with 4-Gate Hallucination Defense: separated planning from execution via a supervisor dispatching to specialist sub-agents; new-skill additions gated by pass^3 ≥ 80% (the pass^k consistency metric from τ-bench) on the regression suite, which surfaces non-deterministic LLM failures that single-run tests miss.
Designed Meta-Tool with Hierarchical Skill Tree: agents perform multi-round retrieval over a 3-layer skill tree, narrowing the candidate set at each descent; yields avg 6 tools loaded per invocation out of 200+ available (~3% surface), invariant to catalog growth.
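The pass^3 gate can be sketched as follows. This uses the unbiased pass^k estimator from the τ-bench paper, C(c, k) / C(n, k) for a task with c successes in n trials, averaged over tasks. How RCAgent actually wires the gate is not stated, so the function names and the shape of the `results` mapping are hypothetical.

```python
from math import comb


def pass_hat_k(successes: int, trials: int, k: int) -> float:
    """Estimate the chance that k independent runs of one task ALL succeed."""
    if not (1 <= k <= trials and 0 <= successes <= trials):
        raise ValueError("need 1 <= k <= trials and 0 <= successes <= trials")
    # comb(successes, k) is 0 when successes < k, so flaky tasks score 0.
    return comb(successes, k) / comb(trials, k)


def suite_pass_hat_k(results: dict[str, list[bool]], k: int) -> float:
    """Average pass^k over every task in a regression suite."""
    scores = [pass_hat_k(sum(runs), len(runs), k) for runs in results.values()]
    return sum(scores) / len(scores)


def gate(results: dict[str, list[bool]], k: int = 3, threshold: float = 0.80) -> bool:
    """Admit a new skill only if the suite-level pass^k meets the threshold."""
    return suite_pass_hat_k(results, k) >= threshold


# Toy suite: one stable task, one task that fails one run in three.
results = {
    "stable-task": [True, True, True],
    "flaky-task": [True, True, False],
}
single_run_rate = sum(sum(r) for r in results.values()) / sum(len(r) for r in results.values())
suite_score = suite_pass_hat_k(results, k=3)
admitted = gate(results)
```

In this toy suite five of six individual runs succeed (about 83%), which would clear an 80% bar on a single-run average, yet pass^3 is 0.5 because the flaky task never passes three times in a row, so the gate rejects the change.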

Company: Alibaba Cloud

Organization: AnalyticDB Org, AI Training Platform

Position: Infrastructure Software Engineer II (Official: SRE II)

Incumbency: Jul. 2025 - present

Platform Leviathan: NVIDIA A100/H100 Infrastructure for Amazon Nova

GPU Lifecycle & Blocklisting: Architected scalable tracking system handling thousands of concurrent scaling requests; designed parent-child DAG in Apache Airflow; eliminated circular termination loops, delivering multi-million dollar annualized savings.
Automated Reliability Orchestrator: Designed fault identification workflow for large-scale GPU clusters using divide-and-conquer with dynamic K8s node labeling; reduced troubleshooting time by ~90% per incident.
API & Monitoring: Built customer-facing API/CLI (Lambda, Smithy); reduced DynamoDB lookup complexity from O(N) to O(log N), cutting query response latency by over 70%.
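The divide-and-conquer fault identification in the Automated Reliability Orchestrator item can be illustrated with a small bisection sketch. It assumes a group-level health check that fails exactly when the group contains at least one bad node; `find_faulty` and `group_ok` are hypothetical names, and in a real cluster each subgroup would presumably be selected through a node label rather than a Python list.

```python
from typing import Callable, Sequence


def find_faulty(
    nodes: Sequence[str],
    group_ok: Callable[[Sequence[str]], bool],
) -> list[str]:
    """Isolate faulty nodes by recursively halving any group that fails its check."""
    if not nodes or group_ok(nodes):
        return []  # a healthy (or empty) group needs no further testing
    if len(nodes) == 1:
        return [nodes[0]]  # a failing group of one is the culprit
    mid = len(nodes) // 2
    return find_faulty(nodes[:mid], group_ok) + find_faulty(nodes[mid:], group_ok)


# Toy cluster of 16 nodes with two injected faults.
nodes = [f"node-{i}" for i in range(16)]
bad = {"node-3", "node-11"}
checks_run = 0


def group_ok(group: Sequence[str]) -> bool:
    """Stand-in for a real group health check (e.g. a collective test run)."""
    global checks_run
    checks_run += 1
    return not (bad & set(group))


faulty = find_faulty(nodes, group_ok)
```

With few faults, the number of group checks grows roughly with the logarithm of the cluster size per fault rather than linearly with the node count, which is where the troubleshooting-time savings of this approach come from.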

Company: Amazon.com LLC

Organization: AGI Org, High Performance Computing

Position: Software Development Engineer

Incumbency: Oct. 2023 - Jul. 2025

Comcom Website: Command Line Tools on The Web

Developed, packaged, and deployed a decoupled web application on AWS EC2, using Linux shell scripts and Docker images.
Built a multi-threaded backend with a SQL database and file-sharing system using Flask and Django with RESTful APIs.
Built the frontend with React.js and Node.js, integrating event, state, and proxy management.

Company: University of California San Diego

Organization: Computer Science Department

Position: Full-Stack Developer

Incumbency: Feb. 2023 - Jun. 2023

Alexa Secure AI Platform: Sensai Self-Service Onboarding Platform

Created and launched a decoupled web application on AWS Lambda and CloudFront for secure onboarding.
Developed Auto-Verification, Canaries, Access Control, and Monitoring modules using the AWS CDK and SDK, drastically reducing app/API integration time.
Refined the existing web UI to extend the self-service capabilities of the onboarding system.

Company: Amazon.com LLC

Organization: Alexa Org

Position: Software Dev Engineer Intern

Incumbency: Jun. 2022 - Sep. 2022

Prometheus Infrastructure Testing Data Analysis and Software Toolkit Development

Established a prototype data pipeline for production line testing data analysis.
Developed three comprehensive Python packages for efficient data collection, alignment, and analysis.
Narrowed the scope of production-line performance improvement work by approximately 66%.

Company: Wiwynn Inc (Acer's Child Company)

Position: Software Dev Engineer Intern

Incumbency: Jul. 2021 - Aug. 2021

Machine Learning and Biometric Recognition Lab

Deep Neural Network Predictor for Blood Pressure Estimation

Designed and implemented a deep learning model with a novel physiological feature selection algorithm.
Enhanced accuracy by ~1.8x and expanded data incorporation by ~6x, achieving MAE of 2.73 mmHg across 2.5M+ cardiac cycles.
Published this work in the international journal Sensors (vol. 20, no. 19).
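For readers unfamiliar with the error metrics mentioned above, here is a minimal sketch of how MAE and the commonly cited AAMI and BHS accuracy criteria are computed from paired readings. The thresholds are the widely quoted ones (AAMI: mean error within ±5 mmHg and standard deviation at most 8 mmHg; BHS grades A/B/C from the share of absolute errors within 5, 10, and 15 mmHg). The function names are hypothetical, the four readings are toy values rather than data from the paper, and the standards' subject-count requirements are not modelled.

```python
from statistics import mean, stdev

# BHS grade -> minimum % of absolute errors within 5, 10 and 15 mmHg.
BHS_GRADES = (("A", (60, 85, 95)), ("B", (50, 75, 90)), ("C", (40, 65, 85)))


def bp_error_report(predicted: list[float], reference: list[float]) -> dict[str, float]:
    """Summarise estimation error (mmHg) between predicted and reference readings."""
    errors = [p - r for p, r in zip(predicted, reference)]
    abs_errors = [abs(e) for e in errors]

    def pct_within(limit: float) -> float:
        return 100 * sum(a <= limit for a in abs_errors) / len(abs_errors)

    return {
        "mae": mean(abs_errors),  # mean absolute error
        "me": mean(errors),       # mean (signed) error
        "sd": stdev(errors),      # sample standard deviation of the error
        "pct_le_5": pct_within(5),
        "pct_le_10": pct_within(10),
        "pct_le_15": pct_within(15),
    }


def meets_aami(report: dict[str, float]) -> bool:
    """AAMI accuracy criterion: |mean error| <= 5 mmHg and SD <= 8 mmHg."""
    return abs(report["me"]) <= 5 and report["sd"] <= 8


def bhs_grade(report: dict[str, float]) -> str:
    """Return the best BHS grade whose three cumulative thresholds are all met."""
    observed = (report["pct_le_5"], report["pct_le_10"], report["pct_le_15"])
    for grade, required in BHS_GRADES:
        if all(o >= r for o, r in zip(observed, required)):
            return grade
    return "D"


# Toy systolic readings in mmHg, for illustration only.
predicted = [120, 118, 131, 125]
reference = [122, 115, 130, 126]
report = bp_error_report(predicted, reference)
```

Here `report["mae"]` is 1.75 mmHg, `meets_aami(report)` is true, and `bhs_grade(report)` is "A", since every toy error is within 5 mmHg.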

Company: Machine Learning and Biometric Recognition Lab

Organization: National Central University

Position: Deep Learning Research Assistant

Incumbency: Dec. 2020 - Oct. 2021
