Ergomotion IOR — Final Solution Project Plan

01 — Current State

What You Have Today

A FastAPI + React monolith that loads static CSV files at startup, processes uploaded SAP exports in-memory with Pandas, and outputs color-coded 192-column Excel files for customs brokers. Auth is JWT + bcrypt with a JSON file user store.

⚛

Frontend — React 18 + Vite

TypeScript, plain CSS, Lucide icons

Drag-and-drop CSV/Excel upload
Validation dashboard (input, customs lines, addition tab)
User management admin panel
Auth context with cookie-based JWT (currently bypassed)
No router — conditional views via state flags

⚡

Backend — FastAPI + Pandas

Python 3.11, Uvicorn, openpyxl

4 CSV master files loaded at startup (LRU cached)
3-component decomposition (NonSteel/Aluminum, Steel, Aluminum)
HTS code lookup, tariff calculation, duty computation
192-column Excel output with dual tabs + color coding
Audit trail with CALC-ID, SHA-256 hash, geolocation

🔐

Auth — Custom JWT

bcrypt, users.json, admin/operator roles

JWT (HS256) with bcrypt password hashing
User store: flat JSON file on disk
Role-based: admin and operator
Password change, reset, forced change flows
Testing mode bypasses real auth

📦

Data — Flat CSV Files

~75 products, on-disk, manually updated

Product_List.csv — materials, weights, COO, HTS codes
HTS Tariff.csv — category/COO tariff mappings
HTS_Code.csv — rate lookup with validity dates
Packaging_Material.csv — packaging SAP numbers
Audit stored locally or in S3 (Object Lock)
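
The validity-date lookup against HTS_Code.csv can be pictured with a minimal sketch. The column names (`hts_code`, `rate`, `valid_from`, `valid_to`) and the sample rows are placeholders invented for illustration; the plan does not list the real columns or rates.

```python
from datetime import date
from typing import List, Optional

# Placeholder rows: codes, rates, and column names are illustrative only.
SAMPLE_RATES = [
    {"hts_code": "0000.00.0001", "rate": 0.10,
     "valid_from": date(2023, 1, 1), "valid_to": date(2023, 12, 31)},
    {"hts_code": "0000.00.0001", "rate": 0.25,
     "valid_from": date(2024, 1, 1), "valid_to": None},  # None = open-ended
]


def rate_on(hts_code: str, on: date, rows: List[dict]) -> Optional[float]:
    """Return the rate whose validity window contains `on`, else None."""
    for row in rows:
        if row["hts_code"] != hts_code:
            continue
        started = row["valid_from"] <= on
        not_ended = row["valid_to"] is None or on <= row["valid_to"]
        if started and not_ended:
            return row["rate"]
    return None
```

The same window test would carry over to a DynamoDB-backed lookup once the CSV files are retired.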

🚀

Deployment

Railway, Render, Lightsail configs

render.yaml, railway.json, deploy_lightsail.sh
No Docker — uses platform buildpacks
No CI/CD pipeline
Environment vars for CORS, JWT secret, S3 keys

📝

Audit Trail

CALC-ID, 7-year retention, S3 Object Lock

Unique ID: CALC-YYYYMMDD-HHMMSS-XXXX
File SHA-256 hash, row counts, tariff rates used
Server IP geolocation, reconciliation check
S3 Object Lock (Compliance mode, 2555 days)
Indexed by PO number and filename
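
As a rough illustration of the ID format and file hash listed above, the sketch below builds a `CALC-YYYYMMDD-HHMMSS-XXXX` string and a SHA-256 digest. It is not taken from audit_trail.py: the function names are hypothetical, and treating `XXXX` as four random uppercase hex characters is an assumption.

```python
import hashlib
import secrets
from datetime import datetime, timezone
from typing import Optional


def make_calc_id(now: Optional[datetime] = None) -> str:
    """Build an ID shaped like CALC-YYYYMMDD-HHMMSS-XXXX."""
    now = now or datetime.now(timezone.utc)
    # Assumption: XXXX is four random uppercase hex characters.
    suffix = secrets.token_hex(2).upper()
    return f"CALC-{now:%Y%m%d}-{now:%H%M%S}-{suffix}"


def file_sha256(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the uploaded file's bytes."""
    return hashlib.sha256(data).hexdigest()
```

Storing the digest alongside the CALC-ID lets an auditor confirm later that an archived file is byte-for-byte the one that was processed.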

02 — Gap Analysis

Current vs. Final State

Every dimension that must change to reach the target architecture, with the current state (the gap) set against the final state (the target).

Dimension	Current State (Gap)	Final State (Target)
Data Source	Static CSVs on disk, manually updated	SAP China sends JSON daily to S3
Data Storage	Flat files on app server	S3 (raw + processed Parquet) + DynamoDB lookups
Data Pipeline	None (loaded at app startup)	AWS Glue ETL, daily scheduled PySpark transforms
Fuzzy Matching	None (exact material match only)	PySpark fuzzy matching for customer name discrepancies
Record Volume	~75 products	6,000–10,000 records per daily batch
Authentication	Custom JWT + bcrypt + users.json	Microsoft Entra ID SSO (M365 app launcher)
Hosting	Railway / Render / Lightsail	AWS (Lambda, S3, DynamoDB, Glue, CloudFront)
Cold Storage	None	S3 Glacier lifecycle (Instant → Flexible → Deep Archive)
Monitoring	Console logs only	CloudWatch dashboards, SNS alerts, WAF

What Carries Forward (No Rebuild Needed)

✓

processor.py

3-component decomposition, steel/aluminum calcs, 192-column output. Moves to Lambda nearly unchanged.

✓

validator.py

Input/output validation, cross-reference checks. Direct port to Lambda.

✓

audit_trail.py

CALC-ID generation, metadata structure. Minor update for Entra ID user fields.

✓

s3_audit_storage.py

Already built for S3 with Object Lock. Reuse directly in Lambda.

✓

Frontend UI

Upload flow, validation tables, admin panel, styling. Only auth layer and API URLs change.

✓

Excel Generation

openpyxl color-coded output with dual tabs (production + audit). Runs in Lambda the same way.

03 — Final Architecture

AWS Target Architecture

Five layers from SAP ingestion through to the user-facing application and 7-year audit archival.

External — SAP China

Daily JSON Push (6K–10K records)

Product Data JSON
HTS Codes JSON
Tariff Rates JSON

Layer 1 — Ingestion

S3 Raw Landing Zone + Lambda Validator

S3 Raw Bucket
Lambda Validator
SNS Alerts
Schema Check
Dedup Detection
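
The schema check and dedup detection in this layer could look roughly like the sketch below. The required fields and the choice of `material` as the dedup key are assumptions; the real contract is still to be agreed with the SAP team.

```python
from typing import Any, Dict, List

# Assumed required fields and types; the actual schema is not yet defined.
REQUIRED_FIELDS = {
    "material": str,
    "hts_code": str,
    "coo": str,
    "weight_kg": (int, float),
}


def validate_batch(records: List[Dict[str, Any]]) -> Dict[str, list]:
    """Split a batch into valid records, schema errors, and duplicates."""
    valid, errors, duplicates = [], [], []
    seen = set()
    for index, record in enumerate(records):
        # A field fails if it is missing or has the wrong type.
        problems = [
            name for name, expected in REQUIRED_FIELDS.items()
            if not isinstance(record.get(name), expected)
        ]
        if problems:
            errors.append((index, problems))
            continue
        key = record["material"]  # assumed dedup key
        if key in seen:
            duplicates.append(index)
            continue
        seen.add(key)
        valid.append(record)
    return {"valid": valid, "errors": errors, "duplicates": duplicates}
```

A non-empty `errors` or `duplicates` list is the natural trigger for an SNS alert before the Glue job is started.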

Layer 2 — ETL & Transformation

AWS Glue (PySpark) — Daily at 02:00 UTC

Clean & Normalize
Fuzzy Match (Levenshtein)
Quality Checks
Cross-Reference
Parquet Export
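
To make the fuzzy-match step concrete, here is a plain-Python sketch of Levenshtein-based name matching. It stands in for the Glue implementation, which would more likely use python-Levenshtein or PySpark's built-in `levenshtein` column function. The `max_distance` threshold of 2 and the normalization rule are assumptions.

```python
from typing import List, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(name.lower().split())


def best_match(name: str, candidates: List[str],
               max_distance: int = 2) -> Optional[Tuple[str, int]]:
    """Return (candidate, distance) for the closest name, or None."""
    target = normalize(name)
    scored = [(levenshtein(target, normalize(c)), c) for c in candidates]
    if not scored:
        return None
    distance, candidate = min(scored)
    return (candidate, distance) if distance <= max_distance else None
```

A fixed distance threshold is the simplest rule; a ratio relative to name length may behave better for very short or very long customer names.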

Layer 3 — Processed Data

S3 Parquet + DynamoDB Lookup Tables

S3 Parquet (partitioned)
DDB: Product Master
DDB: HTS Codes
DDB: Tariff Categories
DDB: Customer Master
DDB: PO Reference

Layer 4 — Application

React (S3 + CloudFront) → API Gateway → Lambda

CloudFront CDN
Entra ID SSO (MSAL.js)
API Gateway
Lambda: Process
Lambda: Download
Lambda: Audit

Layer 5 — Audit & Archival

S3 Object Lock + Glacier Lifecycle (7 Years)

0–90 days: S3 Standard
90–365 days: Glacier Instant
1–3 yrs: Glacier Flexible
3–7 yrs: Deep Archive
CloudWatch + WAF + KMS
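
The tiering schedule maps onto an S3 lifecycle configuration roughly as sketched below. The rule ID and the `audit/` prefix are hypothetical. No expiration rule is included because the plan does not say what happens to objects after year 7.

```python
def audit_lifecycle_rules(prefix: str = "audit/") -> dict:
    """Lifecycle configuration mirroring the tiering schedule above."""
    return {
        "Rules": [
            {
                "ID": "audit-glacier-tiering",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    # Day 90: Glacier Instant Retrieval
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                    # Day 365: Glacier Flexible Retrieval
                    {"Days": 365, "StorageClass": "GLACIER"},
                    # Day 1095 (3 years): Glacier Deep Archive
                    {"Days": 1095, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


# Applying it would look like this (bucket name is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="<audit-bucket>",
#       LifecycleConfiguration=audit_lifecycle_rules(),
#   )
```

Lifecycle transitions change only the storage class, so the Object Lock retention on each object version should continue to apply in the colder tiers.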

04 — Timeline

14-Week Implementation Gantt

Phases overlap where dependencies allow. The critical path runs SAP schema → Glue ETL → DynamoDB → Lambda → API Gateway → Frontend.

Phase	Focus	Weeks
Phase 0: AWS Foundation	Infrastructure	W1–W3
Phase 1: Data Pipeline	SAP → S3 → Glue → DynamoDB	W3–W7
Phase 2: App Migration	FastAPI → Lambda	W6–W10
Phase 3: Entra ID SSO	Microsoft 365 Auth	W8–W11
Phase 4: Frontend Deploy	S3 + CloudFront	W10–W12
Phase 5: Hardening	Go-Live	W12–W14

Critical Path Dependencies

Wk 3: SAP JSON schema must be agreed before Glue ETL can be built

Wk 7: DynamoDB tables must be populated before Lambda functions can be tested

Wk 8: Entra ID app registration must be done by IT admin before MSAL work starts

Wk 10: API Gateway must be live before frontend can switch from direct FastAPI calls

05 — Phase Details

Task Breakdown by Phase

P0

Foundation & Infrastructure

Weeks 1–3 · DevOps Engineer · 7 tasks

ID	Task	Deliverable
0.1	Set up AWS Organization, accounts (dev/staging/prod)	AWS account structure
0.2	Terraform / CloudFormation IaC repository	Infrastructure as Code repo
0.3	Create S3 buckets (raw, processed, audit, frontend) with policies	Buckets with versioning, encryption
0.4	Create DynamoDB tables with GSIs	5 tables: product, HTS, tariff, customer, PO
0.5	Set up IAM roles and policies (Glue, Lambda, S3)	Least-privilege IAM
0.6	Set up CloudWatch log groups, SNS topics	Monitoring baseline
0.7	Set up CI/CD pipeline (GitHub Actions or CodePipeline)	Automated deploy pipeline

P1

Data Pipeline — SAP to S3 to DynamoDB

Weeks 3–7 · Data Engineer + SAP Team · 10 tasks

ID	Task	Deliverable
1.1	Define JSON schema contracts with SAP team (products, HTS, tariffs)	Schema documentation
1.2	SAP team builds daily JSON export to S3 raw landing zone	SAP integration endpoint
1.3	Build Lambda ingestion validator (schema check, dedup, trigger Glue)	Lambda function
1.4	Build AWS Glue ETL job (PySpark): clean, normalize, validate 6K–10K records	Glue job
1.5	Add fuzzy matching for customer name discrepancies (python-Levenshtein)	Fuzzy match module in Glue
1.6	Add quality checks (missing HTS, zero weights, invalid COO, cross-ref)	Quality report output
1.7	Write Parquet output to S3 processed zone (partitioned by date)	Parquet files in S3
1.8	Populate DynamoDB lookup tables from processed Parquet	DynamoDB populated
1.9	Build data quality dashboard / reporting	Quality monitoring
1.10	Backfill: migrate current CSV master data into DynamoDB as baseline	Initial data load verified
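
The quality checks in task 1.6 could be expressed as per-record flags that roll up into a report. The sketch below is illustrative only: the field names, flag names, and the small country allow-list are assumptions, and in Glue the same rules would be written as PySpark column expressions.

```python
from typing import Dict, Iterable, List

# Assumption: a short allow-list of ISO 3166-1 alpha-2 codes for illustration.
VALID_COO = {"CN", "VN", "MX", "US"}


def quality_flags(record: dict) -> List[str]:
    """Return the quality problems found on a single record."""
    flags = []
    if not str(record.get("hts_code") or "").strip():
        flags.append("missing_hts")
    weight = record.get("weight_kg")
    if not isinstance(weight, (int, float)) or weight <= 0:
        flags.append("zero_or_missing_weight")
    if record.get("coo") not in VALID_COO:
        flags.append("invalid_coo")
    return flags


def quality_report(records: Iterable[dict]) -> Dict[str, int]:
    """Count how many records raise each flag across a batch."""
    counts: Dict[str, int] = {}
    for record in records:
        for flag in quality_flags(record):
            counts[flag] = counts.get(flag, 0) + 1
    return counts
```

The per-flag counts are the kind of output the quality dashboard in task 1.9 could chart day over day.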

P2

Application Migration — FastAPI to Lambda

Weeks 6–10 · Backend Engineer · 8 tasks

ID	Task	Deliverable
2.1	Refactor data_loader.py: replace CSV reads with DynamoDB queries	DynamoDB data layer
2.2	Refactor processor.py: keep logic, swap data source to DynamoDB	Processor using DynamoDB
2.3	Package processing code as Lambda function (API Gateway trigger)	Lambda: process
2.4	Package Excel download as Lambda (presigned S3 URL for output)	Lambda: download
2.5	Package audit storage as Lambda layer (reuse existing S3 code)	Lambda: audit
2.6	Set up API Gateway with routes matching current API surface	API Gateway configured
2.7	Implement S3 lifecycle policies for audit (Standard → Glacier tiers)	Lifecycle rules active
2.8	Integration testing: end-to-end with DynamoDB data	Test suite passing
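
For task 2.1, the shape of the change in data_loader.py might resemble this sketch: a single-item DynamoDB read in place of a CSV lookup. The key name `material` and the table name are assumptions. The `table` argument is any object with a boto3-style `get_item` method, so the function can be exercised without AWS access.

```python
from decimal import Decimal
from typing import Any, Optional


def _plain(value: Any) -> Any:
    """Convert Decimal values (as returned by boto3) to floats, recursively."""
    if isinstance(value, Decimal):
        return float(value)
    if isinstance(value, dict):
        return {key: _plain(item) for key, item in value.items()}
    if isinstance(value, list):
        return [_plain(item) for item in value]
    return value


def get_product(table: Any, material: str) -> Optional[dict]:
    """Fetch one product row by its (assumed) partition key."""
    response = table.get_item(Key={"material": material})
    item = response.get("Item")  # key is absent when nothing matches
    return _plain(item) if item is not None else None


# With real AWS access this would be called along the lines of:
#   import boto3
#   table = boto3.resource("dynamodb").Table("<product-table>")
#   product = get_product(table, "<material-number>")
```

The Decimal conversion matters because the boto3 resource API returns DynamoDB numbers as `Decimal`, which does not mix with the float arithmetic a Pandas-based processor typically performs.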

P3

Authentication — Microsoft Entra ID SSO

Weeks 8–11 · Frontend + Backend Engineer · 8 tasks

ID	Task	Deliverable
3.1	Register app in Microsoft Entra ID (Client ID, Tenant ID, redirect URIs)	App registration
3.2	Configure OAuth 2.0 scopes, app roles (admin, operator)	RBAC configuration
3.3	Frontend: Replace AuthContext with MSAL.js (@azure/msal-react)	SSO login flow
3.4	Lambda: Add JWT validation middleware (Microsoft-issued tokens)	Token validation
3.5	Map Entra ID roles to existing admin/operator roles	Role mapping tested
3.6	Update audit trail to capture Entra ID user identity	Audit user fields updated
3.7	Configure M365 app launcher tile	App in waffle menu
3.8	Remove old auth/ module (JWT + bcrypt + users.json)	Dead code removed
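
Tasks 3.5 and 3.6 come down to reading claims from an already validated Entra ID token. The sketch below assumes the app roles are named `admin` and `operator` in the app registration and that the audit record uses the field names shown; both are assumptions. Signature, issuer, and audience validation (task 3.4) must happen before this and is not shown.

```python
from typing import Dict, Optional

# Assumption: app role values configured in Entra ID, highest privilege first.
ROLE_PRIORITY = ["admin", "operator"]


def app_role(claims: dict) -> Optional[str]:
    """Pick the highest-privilege known role from the token's `roles` claim."""
    roles = claims.get("roles") or []
    for role in ROLE_PRIORITY:
        if role in roles:
            return role
    return None  # no recognized role: treat as unauthorized


def audit_user_fields(claims: dict) -> Dict[str, Optional[str]]:
    """Extract the identity fields an audit record might store."""
    return {
        "user_id": claims.get("oid"),                    # object ID of the user
        "user_name": claims.get("preferred_username"),
        "tenant_id": claims.get("tid"),
        "role": app_role(claims),
    }
```

Keying audit records on `oid` rather than the username is a design choice worth considering, since the object ID stays stable when a user's email or display name changes.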

P4

Frontend Deployment & Polish

Weeks 10–12 · Frontend Engineer · 5 tasks

ID	Task	Deliverable
4.1	Deploy React build to S3 + CloudFront (HTTPS, custom domain)	Frontend hosted on AWS
4.2	Update API calls to point to API Gateway endpoint	API integration verified
4.3	Add data quality alerts in UI (ETL quality report flags)	Quality indicators
4.4	Add data freshness indicator ("Last SAP sync: 2h ago")	Freshness badge in header
4.5	E2E testing with real SAP data through full pipeline	UAT complete
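
The freshness badge in task 4.4 needs a relative-age label. One possible approach is for the API to compute it from the timestamp of the last successful ETL run, as sketched here; the function name and the minute/hour/day rounding rules are assumptions.

```python
from datetime import datetime, timedelta, timezone


def freshness_label(last_sync: datetime, now: datetime) -> str:
    """Format the age of the last sync as minutes, hours, or days."""
    minutes = max(0, int((now - last_sync).total_seconds() // 60))
    if minutes < 60:
        age = f"{minutes}m"
    elif minutes < 24 * 60:
        age = f"{minutes // 60}h"
    else:
        age = f"{minutes // (24 * 60)}d"
    return f"Last SAP sync: {age} ago"
```

With a daily batch, an age beyond roughly one day is a reasonable point to switch the badge to a warning state.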

P5

Hardening & Go-Live

Weeks 12–14 · All Team · 6 tasks

ID	Task	Deliverable
5.1	Load testing (6K–10K records through full pipeline)	Performance baseline documented
5.2	Security review (IAM, WAF, encryption, Entra ID audit)	Security sign-off
5.3	Disaster recovery testing (S3 cross-region replication)	DR plan validated
5.4	Runbook documentation (operations, troubleshooting, escalation)	Ops documentation
5.5	Staged rollout (pilot users → full team)	Production go-live
5.6	Decommission old Railway/Render/Lightsail deployment	Old infra shut down

06 — Team Composition

People & Skills Required

5–7 people across 5 roles. Some roles can overlap depending on team size and budget.

☁️

AWS Cloud / DevOps Engineer

1 person · Full-time · Phases 0–5 (14 weeks)

Key Responsibilities

AWS accounts, VPC, IAM, S3, DynamoDB, API Gateway, CloudFront
Terraform / CloudFormation infrastructure as code
CI/CD pipelines (GitHub Actions or CodePipeline)
CloudWatch monitoring, SNS alerts, WAF rules
S3 lifecycle policies (Standard → Glacier tiers)
KMS encryption, Object Lock for audit compliance

Required Skills

AWS Certified, Terraform, S3, DynamoDB, Lambda, API Gateway, CloudFront, IAM, CloudWatch, CI/CD, Docker

🔧

Data / ETL Engineer

1–2 people · Full-time → Part-time · Phases 1–2

Key Responsibilities

Define JSON schema contracts with SAP China team
Build AWS Glue ETL pipeline (PySpark) for 6K–10K records
Implement fuzzy matching (python-Levenshtein on Spark)
Build data quality checks and cross-reference validation
Write Parquet to S3 processed zone, populate DynamoDB
Backfill migration of current CSV data into DynamoDB

Required Skills

Python, PySpark, Pandas, AWS Glue, S3, DynamoDB, Parquet, Fuzzy Matching, JSON Schema, SQL / Athena

⚙️

Backend / Application Engineer

1–2 people · Full-time · Phases 2–4 (8 weeks)

Key Responsibilities

Refactor data_loader.py to read from DynamoDB instead of CSV
Package processing pipeline into AWS Lambda functions
Set up API Gateway routes matching current API surface
Implement Entra ID JWT token validation in Lambda
Handle Excel generation in Lambda (presigned S3 URLs)
Write integration tests (pytest + moto for AWS mocks)

Required Skills

Python, FastAPI, Lambda, boto3, API Gateway, DynamoDB, Entra ID, JWT, Pandas, openpyxl, pytest

🎨

Frontend Engineer

1 person · Full-time → Part-time · Phases 3–5 (6 weeks)

Key Responsibilities

Replace custom AuthContext with MSAL.js (@azure/msal-react)
Implement Entra ID SSO login/logout flows
Update API calls to point to API Gateway endpoint
Add data quality indicators and freshness badge
Deploy frontend to S3 + CloudFront (custom domain, HTTPS)
Handle token refresh and error states

Required Skills

React 18, TypeScript, MSAL.js, OAuth 2.0 / OIDC, Vite, S3 + CloudFront, CSS, REST APIs

📋

Project Manager / Scrum Master

1 person · Part-time (50%) · Phases 0–5 (14 weeks)

Key Responsibilities

Coordinate between team, SAP China, and M365 IT admin
Manage phased rollout timeline and dependency tracking
Facilitate UAT with customs operations team
Manage risk: SAP delays, schema changes, Entra ID access
Coordinate security review and compliance sign-off

Required Skills

Agile / Scrum, Jira / Azure DevOps, AWS Migration, Stakeholder Mgmt, Trade/Customs Domain

07 — Infrastructure Cost

AWS Running Cost Estimate

Estimated monthly AWS service costs after go-live, based on expected usage patterns.

Team Members: 5–7
Duration: 14 weeks
Phases: 6
Est. AWS / Month: $55–155

Post Go-Live Service Breakdown

Service	Usage	Est. Monthly Cost
Lambda	~10K invocations/day, 512MB, 10s avg	$15–50
API Gateway	REST API, ~300K requests/month	$3–10
DynamoDB	On-demand, 5 tables, ~50K reads/day	$10–30
S3 (all buckets)	~50GB raw + processed + audit	$5–15
S3 Glacier	Growing archive over 7 years	$1–5
Glue	1 daily job, 2 DPU, ~5 min	$15–25
CloudFront	Frontend CDN, low traffic	$1–5
CloudWatch	Logs, metrics, alarms	$5–15
TOTAL		$55–155/mo
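
The total can be checked by summing the low and high ends of each service range from the table. The annual figure is a simple derived multiple, not a number stated in the plan.

```python
# (low, high) monthly estimates in USD, copied from the table above.
COSTS = {
    "Lambda": (15, 50),
    "API Gateway": (3, 10),
    "DynamoDB": (10, 30),
    "S3 (all buckets)": (5, 15),
    "S3 Glacier": (1, 5),
    "Glue": (15, 25),
    "CloudFront": (1, 5),
    "CloudWatch": (5, 15),
}

low = sum(lo for lo, _ in COSTS.values())
high = sum(hi for _, hi in COSTS.values())

print(f"Monthly: ${low}-{high}")           # Monthly: $55-155
print(f"Annual:  ${low * 12}-{high * 12}")  # Annual:  $660-1860
```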
