This application is designed to capture annotated datasets for 3D object detection (Objectron format), specifically for beverage cans.
- Ground Plane Detection: Uses ARCore to find surfaces.
- Adjustable Bounding Box: Control scale (X, Y, Z) and rotation (yaw) to fit the physical object.
- Dataset Export: Saves frames as JPEG and annotations as `annotations.json`.
- Objectron Compatible: Exports 9 keypoints (1 center + 8 corners) in 2D and 3D.
- Local Space: The unit cube is defined from -0.5 to 0.5 in all axes.
- World Space: Set by ARCore's internal tracking.
- Camera Space: Transformed using the ARCore View Matrix. `keypoints_3d` are stored in this space (meters).
- Image Space: Projected using Camera Intrinsics. `keypoints_2d` are stored as `[x_norm, y_norm, depth]`.
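
To illustrate the image-space convention, the sketch below scales `keypoints_2d` entries to pixel coordinates. It assumes `x_norm` and `y_norm` are fractions of the image width and height, which this document does not state explicitly. The function name `to_pixels` and the 640x480 image size are hypothetical.

```python
from typing import List, Tuple


def to_pixels(
    keypoints_2d: List[List[float]], width: int, height: int
) -> List[Tuple[float, float]]:
    """Scale normalized [x_norm, y_norm, depth] keypoints to pixel coordinates.

    Assumption: x_norm and y_norm are fractions of the image width and height.
    The depth component is ignored here.
    """
    return [(x_norm * width, y_norm * height) for x_norm, y_norm, _depth in keypoints_2d]


if __name__ == "__main__":
    # Hypothetical values, not taken from a real capture.
    sample = [[0.5, 0.5, 0.8], [0.25, 0.75, 0.9]]
    print(to_pixels(sample, width=640, height=480))
```

If the normalization turns out to use a different origin or range, only the scaling line would need to change.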
The annotations.json file contains a list of frame objects:
{
"frame_id": 0,
"image": "frame_000.jpg",
"keypoints_2d": [[x_norm, y_norm, depth], ...],
"keypoints_3d": [[x_cam, y_cam, z_cam], ...],
"visibility": [1.0, ...],
"camera_intrinsics": { "fx": ..., "fy": ..., ... },
"view_matrix": [...],
"model_matrix": [...],
"timestamp": ...
}

- 0: Center (0,0,0)
- 1-4: Front face corners
- 5-8: Back face corners
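
A quick sanity check on an exported file can catch truncated frames early. The following is a minimal sketch, assuming `annotations.json` is a top-level JSON list of frame objects as described above. The helper names `load_frames` and `validate_frames` are hypothetical and not part of the app.

```python
import json
from typing import Any, Dict, List

NUM_KEYPOINTS = 9  # 1 center + 8 corners, per the keypoint order above


def load_frames(path: str) -> List[Dict[str, Any]]:
    """Read the list of frame objects from an annotations.json file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def validate_frames(frames: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems = []
    for i, frame in enumerate(frames):
        for key in ("keypoints_2d", "keypoints_3d"):
            points = frame.get(key, [])
            if len(points) != NUM_KEYPOINTS:
                problems.append(
                    f"frame {i}: {key} has {len(points)} entries, expected {NUM_KEYPOINTS}"
                )
            elif any(len(p) != 3 for p in points):
                problems.append(f"frame {i}: {key} has an entry without 3 components")
    return problems
```

Calling `validate_frames(load_frames("annotations.json"))` would return an empty list for a file in which every frame carries nine 3-component keypoints in both arrays.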
- Model Matrix: $M = T \cdot R \cdot S$
- Keypoint World: $P_{world} = M \cdot P_{local}$
- Keypoint Camera: $P_{camera} = V \cdot P_{world}$
- Keypoint Image: $x_{pixel} = f_x \cdot (x_{cam} / -z_{cam}) + c_x$
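
The chain of transforms above can be sketched in plain Python. This is an illustrative sketch and not the app's code. It assumes row-major 4x4 matrices, a yaw rotation about the Y axis, and a camera looking down its negative Z axis. The vertical pixel formula is an assumption (image y growing downward while camera y points up), since only the horizontal formula is given above. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]  # 4x4 matrix as row-major nested lists


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Return the 4x4 matrix product a . b."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def model_matrix(t: Sequence[float], yaw: float, s: Sequence[float]) -> Matrix:
    """Compose M = T . R . S, with R a yaw rotation about Y (assumed vertical axis)."""
    c, sn = math.cos(yaw), math.sin(yaw)
    T = [[1, 0, 0, t[0]], [0, 1, 0, t[1]], [0, 0, 1, t[2]], [0, 0, 0, 1]]
    R = [[c, 0, sn, 0], [0, 1, 0, 0], [-sn, 0, c, 0], [0, 0, 0, 1]]
    S = [[s[0], 0, 0, 0], [0, s[1], 0, 0], [0, 0, s[2], 0], [0, 0, 0, 1]]
    return matmul(matmul(T, R), S)


def transform(m: Matrix, p: Sequence[float]) -> Tuple[float, float, float]:
    """Apply a 4x4 matrix to a 3D point using homogeneous coordinate w = 1."""
    v = (p[0], p[1], p[2], 1.0)
    x, y, z = (sum(m[i][k] * v[k] for k in range(4)) for i in range(3))
    return (x, y, z)


def project(
    p_cam: Sequence[float], fx: float, fy: float, cx: float, cy: float
) -> Tuple[float, float]:
    """Project a camera-space point to pixel coordinates."""
    x_cam, y_cam, z_cam = p_cam
    depth = -z_cam  # the camera looks down -Z, so visible points have z_cam < 0
    if depth <= 0:
        raise ValueError("point is behind the camera")
    x_pixel = fx * (x_cam / depth) + cx
    # Assumption: image y grows downward while camera y points up.
    y_pixel = fy * (-y_cam / depth) + cy
    return (x_pixel, y_pixel)
```

For a local keypoint `p`, the pixel position would be `project(transform(matmul(V, M), p), fx, fy, cx, cy)`. If the exported `view_matrix` and `model_matrix` are flat 16-element arrays in column-major order, as ARCore's matrix getters produce, they would need to be reshaped and transposed before being used with these helpers.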
- Open the app and scan the floor until dots appear.
- Tap on the floor to place the bounding box.
- Use the - / + buttons to adjust Scale, Rotation, and Translation in all 3 axes (X, Y, Z) to match your beverage can precisely.
- Tap "START RECORDING".
- Move the phone slowly around the can (360 degrees, different heights) to capture all angles.
- The app captures frames at up to 60 fps and keeps each image synchronized with its matrices.
- Tap "STOP RECORDING" when finished (suggested 12-15 seconds).
- Tap "EXPORT ZIP" to bundle the dataset.
- Pull the `.zip` from `/Android/data/com.example.arcoreapp/files/Pictures/`.
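
Once the archive is on a workstation, it can be worth confirming that every annotated frame has its image. The sketch below assumes the zip contains an `annotations.json` plus the JPEG frames it references. The internal folder layout is not documented here, so files are matched by base name only. The function name `missing_images` is hypothetical.

```python
import json
import posixpath
import zipfile
from typing import IO, List, Union


def missing_images(zip_source: Union[str, IO[bytes]]) -> List[str]:
    """Return the image names listed in annotations.json that are absent from the zip."""
    with zipfile.ZipFile(zip_source) as archive:
        names = archive.namelist()
        # Assumption: the archive holds exactly one annotations.json, at any depth.
        annotation_members = [n for n in names if posixpath.basename(n) == "annotations.json"]
        if not annotation_members:
            raise FileNotFoundError("no annotations.json found in the archive")
        frames = json.loads(archive.read(annotation_members[0]))
        present = {posixpath.basename(n) for n in names}
        return [frame["image"] for frame in frames if frame["image"] not in present]
```

An empty result means every `image` entry has a matching file in the archive.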