timetrvlr-assets
Sync timetrvlr public assets (embeddings, index files) to/from S3. Use this skill before deploying to ensure S3 has the latest assets. Invoke with /timetrvlr-assets.
When & Why to Use This Skill
This Claude skill synchronizes large public assets, such as machine learning embeddings and index files, between a local environment and AWS S3. It supports the deployment workflow for the AWS Amplify app by keeping the S3 copies current before a build, triggering Amplify build jobs, and checking their status.
Use Cases
- Pre-deployment synchronization: Automatically upload the latest machine learning embeddings and index files to S3 to ensure AWS Amplify has access to current data during the build phase.
- Deployment automation: Trigger AWS Amplify release jobs and monitor real-time build status and logs directly through the agent to streamline the production update process.
- Environment parity: Quickly download production assets from S3 to local development environments to ensure consistency during testing and debugging.
- Infrastructure monitoring: List S3 bucket contents and verify AWS SageMaker endpoint status to ensure backend services are operational and correctly configured.
TimeTraveler Asset Management & Deployment
This skill manages the large public assets for timetrvlr_client and handles deployment to AWS Amplify.
Amplify App Details
| Property | Value |
|---|---|
| App ID | d1t9je67akz719 |
| Branch | timetrvlr-amplify |
| Region | eu-west-2 |
| Live URL | https://timetrvlr-amplify.d1t9je67akz719.amplifyapp.com |
URL Format: https://<branch>.<app-id>.amplifyapp.com
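The URL format above can be expressed as a small shell function. This is a minimal sketch; `amplify_url` is a hypothetical helper name and is not part of the repository.

```shell
# Build the default Amplify domain for a branch.
# amplify_url is a hypothetical helper used only for illustration.
amplify_url() {
  branch="$1"
  app_id="$2"
  printf 'https://%s.%s.amplifyapp.com\n' "$branch" "$app_id"
}

amplify_url timetrvlr-amplify d1t9je67akz719
# prints https://timetrvlr-amplify.d1t9je67akz719.amplifyapp.com
```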
Managed Assets
| File | Size | Description |
|---|---|---|
| iiif_no_text_embedding_index.json | ~25MB | Master index mapping image IDs to metadata |
| iiif_no_text_embedding_matrix_vlm_embed_ae3d_hires_1.npy | ~2.5MB | 3D coordinates from AE3D projection |
| runs/vlm_embed_ae3d_hires_1/ae.pt | ~6MB | AE3D model checkpoint |
Commands
Upload Assets to S3 (Before Deploy)
IMPORTANT: Run this before deploying timetrvlr to ensure Amplify has the latest assets.
cd /home/ubuntu/wc_simd
uv run python aws/upload_to_s3.py --overwrite \
demos/timetrvlr/timetrvlr_client/public/iiif_no_text_embedding_index.json
uv run python aws/upload_to_s3.py --overwrite \
demos/timetrvlr/timetrvlr_client/public/iiif_no_text_embedding_matrix_vlm_embed_ae3d_hires_1.npy
uv run python aws/upload_to_s3.py --overwrite \
runs/vlm_embed_ae3d_hires_1/ae.pt
Or use the helper script:
cd /home/ubuntu/wc_simd/demos/timetrvlr/timetrvlr_client
./scripts/download_data.sh up
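The three upload commands can also be written as a loop with a dry-run switch, which makes it easy to review what will be sent before anything is overwritten. This is a sketch only: `upload_assets` and the `DRY_RUN` variable are hypothetical, and the function is assumed to run from /home/ubuntu/wc_simd. It is not a description of what download_data.sh does internally.

```shell
# upload_assets is a hypothetical wrapper around aws/upload_to_s3.py.
# With DRY_RUN=1 it only prints the commands; otherwise it runs them
# and stops at the first failure. Run it from the repository root.
upload_assets() {
  for asset in \
    demos/timetrvlr/timetrvlr_client/public/iiif_no_text_embedding_index.json \
    demos/timetrvlr/timetrvlr_client/public/iiif_no_text_embedding_matrix_vlm_embed_ae3d_hires_1.npy \
    runs/vlm_embed_ae3d_hires_1/ae.pt
  do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "uv run python aws/upload_to_s3.py --overwrite $asset"
    else
      uv run python aws/upload_to_s3.py --overwrite "$asset" || return 1
    fi
  done
}

# Example: DRY_RUN=1 upload_assets
```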
Download Assets from S3 (For Local Dev)
cd /home/ubuntu/wc_simd/demos/timetrvlr/timetrvlr_client
./scripts/download_data.sh down
Or individually:
cd /home/ubuntu/wc_simd
uv run python aws/upload_to_s3.py --download --overwrite \
demos/timetrvlr/timetrvlr_client/public/iiif_no_text_embedding_index.json
List Current S3 Assets
cd /home/ubuntu/wc_simd
uv run python aws/upload_to_s3.py --list demos/timetrvlr/timetrvlr_client/public --sizes
Verify Assets Exist Locally
ls -lh demos/timetrvlr/timetrvlr_client/public/*.json demos/timetrvlr/timetrvlr_client/public/*.npy
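A listing shows what is present but does not fail when a file is absent or empty. The sketch below checks the two public files from the Managed Assets table and returns a non-zero status if either is missing; `check_assets` is a hypothetical helper, not an existing script.

```shell
# check_assets DIR: reports each expected file under DIR and returns
# non-zero if any is missing or empty. Hypothetical helper.
check_assets() {
  dir="$1"
  missing=0
  for name in \
    iiif_no_text_embedding_index.json \
    iiif_no_text_embedding_matrix_vlm_embed_ae3d_hires_1.npy
  do
    if [ -s "$dir/$name" ]; then
      echo "ok       $name"
    else
      echo "MISSING  $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_assets demos/timetrvlr/timetrvlr_client/public
```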
S3 Location
- Bucket: wellcomecollection-dsim
- Prefix: wc_simd/demos/timetrvlr/timetrvlr_client/public/
- Full URI: s3://wellcomecollection-dsim/wc_simd/demos/timetrvlr/timetrvlr_client/public/
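The prefix above is the repository-relative path of the public directory with wc_simd/ in front. Assuming the same rule holds for every uploaded path (an assumption; it is only confirmed here for the public directory), the mapping can be sketched as follows. `s3_uri_for` is a hypothetical helper.

```shell
S3_BUCKET="wellcomecollection-dsim"
S3_ROOT_PREFIX="wc_simd"   # assumed: the repository root maps to this key prefix

# s3_uri_for PATH: maps a repository-relative path to its S3 URI.
# Hypothetical helper for illustration.
s3_uri_for() {
  rel="${1#./}"   # drop a leading ./ if present
  printf 's3://%s/%s/%s\n' "$S3_BUCKET" "$S3_ROOT_PREFIX" "$rel"
}

s3_uri_for demos/timetrvlr/timetrvlr_client/public/
# prints s3://wellcomecollection-dsim/wc_simd/demos/timetrvlr/timetrvlr_client/public/
```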
How Amplify Uses These Assets
During amplify.yml preBuild phase:
- sync_public_assets.sh downloads assets from ASSET_S3_ROOT
- Assets are placed in public/ directory
- Next.js build includes them as static files
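The contents of sync_public_assets.sh are not shown here, so the following is only a guess at the shape of such a step: fail early when ASSET_S3_ROOT is unset, then copy the prefix into public/ with `aws s3 sync`. The function name and the `DRY_RUN` switch are hypothetical, and the real script may work differently.

```shell
# Hypothetical stand-in for a preBuild asset sync; the real
# sync_public_assets.sh may differ.
sync_public_assets() {
  if [ -z "${ASSET_S3_ROOT:-}" ]; then
    echo "ASSET_S3_ROOT is not set" >&2
    return 1
  fi
  dest="${1:-public}"
  src="${ASSET_S3_ROOT%/}/"   # normalise to exactly one trailing slash
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "aws s3 sync $src $dest/"
  else
    mkdir -p "$dest" && aws s3 sync "$src" "$dest/"
  fi
}

# Example: ASSET_S3_ROOT=s3://bucket/prefix DRY_RUN=1 sync_public_assets public
```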
Deploy to Amplify
Full Deploy (Upload Assets + Trigger Build)
cd /home/ubuntu/wc_simd
# 1. Upload assets to S3
cd demos/timetrvlr/timetrvlr_client && ./scripts/download_data.sh up
cd /home/ubuntu/wc_simd
# 2. Trigger Amplify build
aws amplify start-job \
--app-id d1t9je67akz719 \
--branch-name timetrvlr-amplify \
--job-type RELEASE \
--region eu-west-2
Monitor Build Status
# Get the status of a specific job (replace <JOB_ID>)
aws amplify get-job \
--app-id d1t9je67akz719 \
--branch-name timetrvlr-amplify \
--job-id <JOB_ID> \
--region eu-west-2 \
--query "job.summary.status" \
--output text
# List recent jobs
aws amplify list-jobs \
--app-id d1t9je67akz719 \
--branch-name timetrvlr-amplify \
--region eu-west-2 \
--max-results 5 \
--query "jobSummaries[].{jobId:jobId,status:status,startTime:startTime}" \
--output table
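Checking status by hand can be replaced with a polling loop that stops once the job reaches a terminal state. The sketch below assumes SUCCEED, FAILED, and CANCELLED are the terminal Amplify job statuses; `wait_for_job` and `amplify_job_status` are hypothetical helper names, and the status command is passed in as an argument so the loop can be exercised without AWS access.

```shell
# wait_for_job STATUS_CMD [INTERVAL_SECONDS] [MAX_POLLS]
# Polls STATUS_CMD (a command that prints the current job status)
# until the job succeeds (returns 0), fails or is cancelled
# (returns 1), or the poll limit is reached (returns 2).
wait_for_job() {
  get_status="$1"
  interval="${2:-15}"
  attempts="${3:-80}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    job_status="$($get_status)"
    echo "status: $job_status" >&2
    case "$job_status" in
      SUCCEED) return 0 ;;
      FAILED|CANCELLED) return 1 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for job" >&2
  return 2
}

# Status command for this app; expects JOB_ID in the environment.
amplify_job_status() {
  aws amplify get-job \
    --app-id d1t9je67akz719 \
    --branch-name timetrvlr-amplify \
    --job-id "$JOB_ID" \
    --region eu-west-2 \
    --query "job.summary.status" \
    --output text
}

# Example: JOB_ID=<JOB_ID> wait_for_job amplify_job_status
```

The job ID can be captured when the build is triggered by adding `--query "jobSummary.jobId" --output text` to the `aws amplify start-job` command.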
Check Build Logs (if build fails)
aws amplify get-job \
--app-id d1t9je67akz719 \
--branch-name timetrvlr-amplify \
--job-id <JOB_ID> \
--region eu-west-2 \
--query "job.steps[].{name:stepName,status:status,logUrl:logUrl}"
Typical Workflow
After Generating New Embeddings
1. Run AE3D inference to generate new 3D coords:
    ./scripts/infer_ae3d.sh runs/ae3d_*/ae.pt data/vlm_embed/embeddings.npy data/vlm_embed/index.parquet
2. Upload new assets to S3:
    cd demos/timetrvlr/timetrvlr_client
    ./scripts/download_data.sh up
3. Trigger Amplify build:
    aws amplify start-job --app-id d1t9je67akz719 --branch-name timetrvlr-amplify --job-type RELEASE --region eu-west-2
Quick Deploy (Assets Already Current)
If assets haven't changed, just trigger a rebuild:
aws amplify start-job --app-id d1t9je67akz719 --branch-name timetrvlr-amplify --job-type RELEASE --region eu-west-2
Troubleshooting
SSO Session Expired
The upload script auto-handles SSO login. If prompted, complete the browser auth flow.
Asset Not Found in S3
Check the exact path matches:
uv run python aws/upload_to_s3.py --list demos/timetrvlr/timetrvlr_client/public --sizes
Amplify Build Fails on Asset Download
Verify that the ASSET_S3_ROOT environment variable in Amplify matches the S3 location listed under S3 Location, and that the IAM role used by the build has s3:GetObject permission.
Build Stuck or Failed
- Check build logs:
    aws amplify list-jobs --app-id d1t9je67akz719 --branch-name timetrvlr-amplify --region eu-west-2 --max-results 3
- Get detailed step status for failed job:
    aws amplify get-job --app-id d1t9je67akz719 --branch-name timetrvlr-amplify --job-id <JOB_ID> --region eu-west-2
Site Not Updating After Deploy
- Clear browser cache or use incognito
- Verify build status is SUCCEED
- Check the correct URL: https://timetrvlr-amplify.d1t9je67akz719.amplifyapp.com
Backend Deployment (SageMaker)
Backend Details
| Property | Value |
|---|---|
| Endpoint | EmbeddingEndpoint-u6w61sZPU1fj |
| ECR Repo | 760097843905.dkr.ecr.eu-west-2.amazonaws.com/embed-inference |
| Instance | ml.g4dn.xlarge |
| API Gateway | https://zymevperp0.execute-api.eu-west-2.amazonaws.com/embed |
Rebuild and Deploy Backend
When updating the AE3D checkpoint or backend code:
cd /home/ubuntu/wc_simd/demos/timetrvlr/timetrvlr_backend
# 1. Copy latest AE checkpoint
./copy_ae_ckpt.sh
# 2. ECR login
aws ecr get-login-password --region eu-west-2 | docker login --username AWS --password-stdin 760097843905.dkr.ecr.eu-west-2.amazonaws.com
# 3. Build and push (use --no-cache if dependencies changed)
docker build -t embed-inference:latest .
docker tag embed-inference:latest 760097843905.dkr.ecr.eu-west-2.amazonaws.com/embed-inference:latest
docker push 760097843905.dkr.ecr.eu-west-2.amazonaws.com/embed-inference:latest
# 4. Update SageMaker endpoint (see /deploy skill for full commands)
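The registry string is repeated in the login, tag, and push commands, so a typo in one place is easy to miss. As a small sketch, the image URI can be built once and reused; `ecr_image_uri` is a hypothetical helper name.

```shell
# ecr_image_uri ACCOUNT REGION REPO [TAG]: prints the full ECR image URI.
# Hypothetical helper; TAG defaults to "latest".
ecr_image_uri() {
  account="$1"
  region="$2"
  repo="$3"
  tag="${4:-latest}"
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$account" "$region" "$repo" "$tag"
}

# Example:
#   IMAGE="$(ecr_image_uri 760097843905 eu-west-2 embed-inference)"
#   docker build -t "$IMAGE" .
#   docker push "$IMAGE"
```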
Known Issues
transformers Version Conflict
The GME-Qwen2-VL model requires transformers<4.52.0. If you get errors like:
transformers<4.52.0 is required but found transformers==4.57.3
Fix in requirements.txt and Dockerfile:
transformers>=4.37.0,<4.52.0
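To catch this before building the image, the installed version can be compared against the pinned range. The sketch below handles plain numeric versions only (not pre-release tags) and relies on `sort -V`, which is assumed to be available; `version_lt` and `transformers_ok` are hypothetical helper names.

```shell
# version_lt A B: succeeds when version A is strictly lower than B.
# Relies on sort -V (assumed available, e.g. GNU coreutils).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# transformers_ok VERSION: succeeds when 4.37.0 <= VERSION < 4.52.0.
transformers_ok() {
  ! version_lt "$1" 4.37.0 && version_lt "$1" 4.52.0
}

# Example:
#   v="$(python -c 'import transformers; print(transformers.__version__)')"
#   transformers_ok "$v" || echo "transformers $v is outside >=4.37.0,<4.52.0" >&2
```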
504 Errors / "Service Unavailable"
Cause: SageMaker auto-scaling is set to MinCapacity=0, so the endpoint scales to zero after inactivity.
Check endpoint status:
aws sagemaker describe-endpoint --endpoint-name EmbeddingEndpoint-u6w61sZPU1fj --region eu-west-2 --query 'EndpointStatus'
If status is "Updating", wait for "InService".
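When scripting this check, it helps to sort the status into "ready", "wait", or "error". The sketch below treats Creating, Updating, SystemUpdating, and RollingBack as transitional and everything other than InService as an error; `endpoint_state` is a hypothetical helper name. Adding `--output text` to the describe-endpoint command returns the status without surrounding quotes.

```shell
# endpoint_state STATUS: classifies a SageMaker EndpointStatus value.
# Hypothetical helper for illustration.
endpoint_state() {
  case "$1" in
    InService) echo ready ;;
    Creating|Updating|SystemUpdating|RollingBack) echo wait ;;
    *) echo error ;;
  esac
}

# Example:
#   s="$(aws sagemaker describe-endpoint \
#         --endpoint-name EmbeddingEndpoint-u6w61sZPU1fj \
#         --region eu-west-2 --query 'EndpointStatus' --output text)"
#   endpoint_state "$s"
```

The AWS CLI also provides a built-in waiter, `aws sagemaker wait endpoint-in-service --endpoint-name EmbeddingEndpoint-u6w61sZPU1fj --region eu-west-2`, which blocks until the endpoint is InService.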
Fix: Prevent scale-to-zero (keeps 1 instance always running):
aws application-autoscaling register-scalable-target \
--service-namespace sagemaker \
--resource-id "endpoint/EmbeddingEndpoint-u6w61sZPU1fj/variant/AllTraffic" \
--scalable-dimension "sagemaker:variant:DesiredInstanceCount" \
--min-capacity 1 --max-capacity 1 \
--region eu-west-2
Cost: ~$380/month for always-on ml.g4dn.xlarge.
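The monthly figure is the hourly instance rate multiplied by the hours in a month. As a rough check, ~$380 over 730 hours corresponds to about $0.52 per hour; that rate is back-calculated from the figure above and is not a quoted price, so substitute the current rate for the region. `monthly_cost` is a hypothetical helper name.

```shell
# monthly_cost HOURLY_RATE [HOURS]: prints rate * hours rounded to
# whole dollars. HOURS defaults to 730 (average hours per month).
monthly_cost() {
  awk -v rate="$1" -v hours="${2:-730}" 'BEGIN { printf "%.0f\n", rate * hours }'
}

monthly_cost 0.52   # prints 380 (0.52 is an assumed rate, see above)
```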
Test Backend Directly
# Test API Gateway
curl -X POST "https://zymevperp0.execute-api.eu-west-2.amazonaws.com/embed" \
-H "Content-Type: application/json" \
-d '{"texts": ["medical illustration"], "instruction": "Find an image that matches the given text."}'
# Test SageMaker async directly
S3_KEY="test-$(date +%s).json"
echo '{"texts": ["test"], "instruction": "Find an image."}' | aws s3 cp - "s3://embeddingendpointstack-asyncinputbucketc9af2d68-kutlnuucvpqq/$S3_KEY" --region eu-west-2
aws sagemaker-runtime invoke-endpoint-async \
--endpoint-name EmbeddingEndpoint-u6w61sZPU1fj \
--content-type application/json \
--input-location "s3://embeddingendpointstack-asyncinputbucketc9af2d68-kutlnuucvpqq/$S3_KEY" \
--region eu-west-2
Full Deploy Checklist
1. Frontend assets (3D coords, index):
    cd demos/timetrvlr/timetrvlr_client && ./scripts/download_data.sh up
2. Backend checkpoint (if AE3D model changed):
    cd demos/timetrvlr/timetrvlr_backend && ./copy_ae_ckpt.sh
    # Then rebuild and push Docker image
3. Trigger Amplify build:
    aws amplify start-job --app-id d1t9je67akz719 --branch-name timetrvlr-amplify --job-type RELEASE --region eu-west-2
4. Update SageMaker (if backend code/checkpoint changed): See /deploy skill for SageMaker endpoint update commands.