# KinectViewer
Small macOS SwiftUI viewer for Kinect v2 streams through libfreenect2.
It displays:
- RGB color
- Depth as a false-color distance map (see the colormap sketch after this list)
- Separate point-cloud tab built from libfreenect2's registered XYZRGB projection, with camera rotation, denser screen-space splats, surface reconstruction, and render controls
- Infrared intensity
- Registered color aligned to the depth camera
- Optional Apple Vision face boxes and hand landmarks over the RGB stream
- A scene panel with "What the camera sees" and "Detected activity" summaries
- Click the scene description area or the `Full text` control to expand the complete generated text, including the selected depth description
- Click-to-measure depth: click the depth or registered stream to measure that point in millimeters/meters, with a local median and mean around the clicked pixel (see the measurement sketch after this list)
- Selected crop panel: displays a registered RGB crop around the click where depth-similar pixels are cyan-tinted and other depths are dimmed
- Selected-object description: the app classifies a registered RGB crop around the clicked depth point and combines those local labels with the depth measurement
- Drag panels in the stream grid to reorder them; the order is saved across app restarts, and `Reset Layout` restores the default order
- The stream grid keeps visible scroll indicators, and depth-measurement clicks no longer consume scroll drags
- A packaged app icon loaded at launch from `Resources/AppIcon.icns`
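
For illustration, a minimal sketch of one way the depth-to-color mapping could work; the range bounds, ramp, and function name here are assumptions, not the app's actual palette:

```swift
/// Hypothetical false-color mapping: normalize depth into [0, 1] over an
/// assumed visible range and ramp blue (near) through green to red (far).
func falseColor(depthMM: Float,
                nearMM: Float = 500, farMM: Float = 4500) -> (r: UInt8, g: UInt8, b: UInt8) {
    guard depthMM.isFinite, depthMM > 0 else { return (0, 0, 0) } // invalid: black
    let t = min(max((depthMM - nearMM) / (farMM - nearMM), 0), 1)
    let r = UInt8(min(max(2 * t - 1, 0), 1) * 255)
    let g = UInt8((1 - abs(2 * t - 1)) * 255)
    let b = UInt8(min(max(1 - 2 * t, 0), 1) * 255)
    return (r, g, b)
}
```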
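And a minimal sketch of the click-to-measure statistics, assuming a row-major `Float` depth buffer in millimeters (libfreenect2 depth frames are 512×424 floats); the window size and helper name are illustrative, not the app's actual API:

```swift
import Foundation

/// Hypothetical helper: median and mean depth (in mm) in a small window
/// around a clicked pixel, ignoring invalid (zero or non-finite) samples.
func measureDepth(at x: Int, _ y: Int,
                  in depth: [Float], width: Int, height: Int,
                  window: Int = 5) -> (medianMM: Float, meanMM: Float)? {
    let r = window / 2
    var samples: [Float] = []
    for dy in -r...r {
        for dx in -r...r {
            let px = x + dx, py = y + dy
            guard px >= 0, px < width, py >= 0, py < height else { continue }
            let d = depth[py * width + px]
            if d.isFinite && d > 0 { samples.append(d) }
        }
    }
    guard !samples.isEmpty else { return nil }   // no valid depth near the click
    samples.sort()
    let median = samples[samples.count / 2]
    let mean = samples.reduce(0, +) / Float(samples.count)
    return (median, mean)
}
```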
## Build and Run
From this repository:
```sh
cd ui
swift run KinectViewer
```
To build a standalone `.app` bundle with the icon and embedded libfreenect2 dependencies:

```sh
cd ui
./build.sh
open dist/KinectViewer.app
```
The package links against the existing libfreenect2 build in `../build/lib`, so build libfreenect2 first if that directory is missing or stale.
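
For reference, a minimal sketch of how a SwiftPM manifest can point the linker at that local build; the repository's real `Package.swift` likely differs in target layout and flags:

```swift
// swift-tools-version:5.9
// Hypothetical manifest sketch: link against a libfreenect2 build in
// ../build/lib. The actual Package.swift in this repo may differ.
import PackageDescription

let package = Package(
    name: "KinectViewer",
    targets: [
        .executableTarget(
            name: "KinectViewer",
            linkerSettings: [
                .linkedLibrary("freenect2"),
                .unsafeFlags(["-L../build/lib"])  // local libfreenect2 build
            ]
        )
    ]
)
```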
## Notes
- The default `Auto` pipeline lets libfreenect2 choose the packet pipeline. `CPU`, `OpenGL`, and `OpenCL` are available from the picker, but the selected pipeline must be enabled in the local libfreenect2 build.
- The C++ bridge copies each frame into Swift-friendly RGBA buffers, so SwiftUI never owns libfreenect2 frame memory.
- The point-cloud tab renders through a Metal-backed `MTKView`: dots and splats use sampled XYZRGB points, while Surface uses an organized undistorted depth + registered-color frame and rejects triangles across depth discontinuities in the Metal vertex path (a CPU-side sketch of that test appears after these notes).
- Point-cloud controls include screen-space splat, surface, and dot rendering; splat radius and surface fill; brightness; dot size; RGB/depth/solid coloring; display density; max-depth clipping; built-in or file-based HDRI scene environments; enabled-by-default depth/scene/completion/occlusion features; stats overlay visibility; keyboard browsing with arrow keys; and camera reset.
- The Background panel can capture an empty-room point cloud and switch to People view, which composites the captured background with live foreground points that are closer than the captured background by at least the selected threshold (see the foreground test sketched after these notes). Captured-background opacity and the people-separation threshold are adjustable.
- The RGB Scene tab supports automatic sparse-RGB feature tracking with interpolated yaw, pitch, roll, and translation estimates, then uses RGB-D ICP to refine 6-DOF keyframe poses before merging them into a voxel scene. Guided capture still lets you add the current view or add left/right stepped views, so scene building does not depend entirely on RGB tracker confidence. After adding views, `Align Captures` runs a background post-capture ICP pass over the stored views, shows progress, and rebuilds the scene with corrected poses.
- Completion combines a temporal background cache with same-row organized-depth inpainting in Live view, so occluded wall pixels are filled only when supported by previously visible background samples or neighboring wall samples in the current Kinect depth grid.
- Recognition modes run on the RGB image in Swift with Apple's Vision framework. They do not change the Kinect capture pipeline.
- Scene summaries are generated from structured RGB-D observations: face counts, hand counts, RGB image-classification labels, and the clicked depth measurement. On macOS versions with Apple's Foundation Models framework available, the app uses `LanguageModelSession`; otherwise it falls back to deterministic Vision/depth summaries (a sketch of this two-path flow appears after these notes).
- The Scene panel shows which path produced the latest text: `FoundationModels` when `LanguageModelSession` was used, or `Fallback` when the deterministic Vision/depth summary was used.
- If FoundationModels cannot run, the Scene panel shows the concrete availability or generation error reported by Apple APIs.
- The selected depth description uses the clicked point distance, local depth mean, sample coverage, current RGB scene labels, and local labels from the registered RGB crop around the selected point.
- The Kinect bridge tears down streams idempotently and deletes the libfreenect2 device after stopping, so the device destructor owns the final close path (see the teardown sketch after these notes).
- Closing the app uses a fast termination path that stops streaming without waiting on libfreenect2's USB close/release path.
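
The surface mode's discontinuity test referenced above can be illustrated with a CPU-side sketch. This is not the app's Metal vertex code; the threshold value and function name are assumptions:

```swift
/// Hypothetical CPU-side version of the surface mode's discontinuity test:
/// a triangle over the organized depth grid is kept only if all three
/// corner depths are valid and no edge spans a large depth jump.
func keepTriangle(_ d0: Float, _ d1: Float, _ d2: Float,
                  maxJumpMM: Float = 50) -> Bool {
    for d in [d0, d1, d2] where !(d.isFinite && d > 0) { return false }
    return abs(d0 - d1) < maxJumpMM
        && abs(d1 - d2) < maxJumpMM
        && abs(d2 - d0) < maxJumpMM
}
```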
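The People-view compositing test is, in essence, a per-pixel depth comparison. A hedged sketch, assuming depths in millimeters and treating invalid samples conservatively; the names are hypothetical:

```swift
/// Hypothetical per-pixel People-view test: a live pixel counts as
/// foreground (a person) when it is closer than the captured background
/// by at least the user-selected threshold.
func isForeground(liveMM: Float, backgroundMM: Float,
                  thresholdMM: Float) -> Bool {
    guard liveMM.isFinite, liveMM > 0 else { return false }            // invalid live sample
    guard backgroundMM.isFinite, backgroundMM > 0 else { return true } // no background: assume foreground
    return liveMM < backgroundMM - thresholdMM
}
```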
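The two-path summary flow might look roughly like the following. The availability version, prompt shape, and function name are assumptions; only `LanguageModelSession` and its `respond(to:)` call come from Apple's FoundationModels API, and the app's actual structured prompt is richer:

```swift
import Foundation
#if canImport(FoundationModels)
import FoundationModels
#endif

/// Hypothetical two-path summary: FoundationModels when available,
/// deterministic Vision/depth text otherwise. Returns the text plus a
/// label naming which path produced it, as the Scene panel does.
func summarize(observations: String) async -> (text: String, source: String) {
    #if canImport(FoundationModels)
    if #available(macOS 26.0, *) {
        do {
            let session = LanguageModelSession()
            let response = try await session.respond(
                to: "Summarize this RGB-D scene: \(observations)")
            return (response.content, "FoundationModels")
        } catch {
            // Fall through to the deterministic path on generation errors.
        }
    }
    #endif
    return ("Observed: \(observations)", "Fallback")
}
```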
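Finally, the idempotent teardown pattern, sketched on the Swift side with a hypothetical bridge type; the real bridge crosses into C++, but the guard-once shape is the point:

```swift
/// Hypothetical bridge wrapper: stop() may be called any number of times,
/// but streams are stopped and the device released exactly once.
final class KinectBridge {
    private var isStopped = false

    func stop() {
        guard !isStopped else { return }   // second call is a no-op
        isStopped = true
        stopStreams()                      // stop color/depth/ir listeners
        releaseDevice()                    // delete the libfreenect2 device;
                                           // its destructor owns the final close
    }

    private func stopStreams() { /* bridge into C++ */ }
    private func releaseDevice() { /* bridge into C++ */ }
}
```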