Here's an easy-to-understand Visual Simultaneous Localization And Mapping (VSLAM) algorithm.
I made a whole YouTube video series to explain it.
If you want to quickly get to the meat of the code, go to `vslam/frontend.py` and read the `Frontend.track()` function - that's what gets called on every iteration to resolve the camera pose from image input.
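For orientation, usage is conceptually along these lines. Only `Frontend` and `track()` are real; the constructor arguments, `load_frames`, and the return value are assumptions for illustration, not the repo's actual API:

```python
# Hypothetical sketch only -- Frontend.track() exists in vslam/frontend.py,
# but the surrounding names and signatures here are made up for illustration.
frontend = Frontend()

for frame in load_frames():       # e.g. images pre-generated by sim.run
    pose = frontend.track(frame)  # resolve the camera pose for this frame
```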
It works on top of data coming from a kinda-easy-to-understand, from-scratch, triangle-based scene renderer.
The VSLAM part of this repo is largely a reinterpretation of tutorials from the excellent book "Introduction to Visual SLAM: From Theory to Practice". See the associated GitHub repo provided by the authors of the book; they also generously provide a PDF of the book itself in a related repo. This is very awesome, and I am grateful for that.
The main entry point into the VSLAM demo is:

```bash
pipenv shell
python -m lessons.ex_03_full_frontend
```

This live-renders the environment, which at around 1 fps is a bit too slow to feel real-time. It's better to first pre-generate the data with `python -m sim.run` and then run it from the saved data.
Repo structure:

```
vslam
├── lessons               - scripts that run the framework piece by piece
├── vslam                 - the vslam library; doesn't use jax
│   ├── keyframe          - the most important functions that drive the SLAM algorithm:
│   │                         def estimate_keyframe()
│   │                         def estimate_pose_wrt_keyframe()
│   └── frontend          - the primary VSLAM state holder; pulls everything together
└── sim                   - the rendering framework; uses jax
    └── egocentric_render - the most important functions that drive rendering:
                              def parallel_z_buffer_render() - does the object drawing
```
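To give a feel for how `estimate_keyframe()` and `estimate_pose_wrt_keyframe()` interact, here is a heavily simplified sketch of a keyframe-based frontend. The signatures, the `pose`/`quality` return values and the threshold are assumptions for illustration, not the repo's real API:

```python
# Hypothetical sketch of a keyframe-based frontend. The two estimate_*
# functions live in vslam/keyframe, but their signatures are simplified
# assumptions here.
MIN_MATCH_QUALITY = 0.3   # made-up threshold

class FrontendSketch:
    def __init__(self):
        self.keyframe = None

    def track(self, frame):
        if self.keyframe is None:
            # Bootstrap: the first frame becomes the reference keyframe.
            self.keyframe = estimate_keyframe(frame)
            return self.keyframe.pose
        # Estimate the current pose relative to the active keyframe.
        pose, quality = estimate_pose_wrt_keyframe(frame, self.keyframe)
        if quality < MIN_MATCH_QUALITY:
            # Too little feature overlap left; start a new keyframe here.
            self.keyframe = estimate_keyframe(frame, initial_pose=pose)
        return pose
```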
I recommend using `pipenv`:

```bash
python -m pip install pipenv
pipenv --python 3.10   # this assumes that you have python 3.10 installed
pipenv shell
pip install -r requirements.txt
```

To briefly summarize, we depend mostly on `attrs`, `numpy`, `jax` (!), `opencv-python`, `pandas` and `scipy`. `jax` is not critical and could be done away with in favour of `numpy`, but it keeps things fast.
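For a flavour of why `jax` helps: jitting a numpy-style kernel compiles it with XLA, which is typically much faster than evaluating the same array code op by op. A tiny self-contained example (not code from this repo):

```python
import jax
import jax.numpy as jnp

@jax.jit  # compile the whole kernel with XLA on first call
def shade(depth):
    # arbitrary per-pixel arithmetic, fused into one compiled kernel
    return jnp.clip(1.0 / (1.0 + depth), 0.0, 1.0) ** 2.2

img = shade(jnp.linspace(0.1, 10.0, 640 * 480).reshape(480, 640))
```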
I made a small triangle-rendering library to produce data for VSLAM. This way we fully control the data coming into the algorithm, and we get to learn the camera equations from the "inverse problem" side - indeed, rendering is in a way the inverse of SLAM. The command below runs the entry point of the interactive rendering code:
```bash
pipenv shell
python -m sim.render
```
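Those camera equations, for reference, are the standard pinhole projection: a world point is rotated and translated into the camera frame, then projected through the intrinsics matrix K. Rendering applies this forward; SLAM runs it backwards. A minimal illustration with made-up numbers (not code from this repo):

```python
import numpy as np

# Pinhole projection: x ~ K [R | t] X_w.
# Rendering maps world points to pixels; SLAM estimates R, t from pixels.
K = np.array([[500.0,   0.0, 320.0],    # fx,  0, cx
              [  0.0, 500.0, 240.0],    #  0, fy, cy
              [  0.0,   0.0,   1.0]])   # illustrative intrinsics

R = np.eye(3)                  # camera rotation (world -> camera)
t = np.array([0.0, 0.0, 4.0])  # camera translation

X_w = np.array([1.0, -0.5, 2.0])      # a point in world coordinates
X_c = R @ X_w + t                     # into camera coordinates
u, v, w = K @ X_c                     # homogeneous image coordinates
pixel = np.array([u / w, v / w])      # perspective divide -> pixel position
```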
Sorry, it's too slow to feel smooth to humans! It seems we would need a C++ rewrite to make it properly fast. It doesn't help that Jax doesn't like modifying arrays in place.
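Concretely, JAX arrays are immutable, so an update like `buffer[i, j] = z` has to go through the functional `.at[...]` API, which returns a new array. That matters for a z-buffer, whose whole job is per-pixel overwriting. A tiny self-contained illustration:

```python
import jax.numpy as jnp

depth = jnp.full((4, 4), jnp.inf)   # depth buffer, nothing drawn yet

rows = jnp.array([1, 1, 2])         # pixel coordinates of three
cols = jnp.array([2, 2, 3])         # rasterized fragments
zs   = jnp.array([5.0, 3.0, 7.0])   # two fragments compete for pixel (1, 2)

# depth[rows, cols] = min(depth[rows, cols], zs) -- but functionally:
# .at[...].min() returns a NEW array with the scatter-min applied.
depth = depth.at[rows, cols].min(zs)

print(depth[1, 2])   # 3.0 -- the nearest fragment won
```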
- Experiment with WSAD, QE and arrow keys.
- Escape to quit.
- Change `__main__` to save data.

One eye image looks more or less like this:
Here's an older toy render of a cube.