Ai Dreams Forum

Member's Experiments & Projects => AI Programming => Topic started by: frankinstien on July 01, 2020, 07:58:23 pm

Title: LSD-SLAM: Large-Scale Direct Monocular SLAM
Post by: frankinstien on July 01, 2020, 07:58:23 pm
Has anyone tried this solution, LSD-SLAM (https://vision.in.tum.de/research/vslam/lsdslam?redirect=1), or does anyone know of a better one? I want to do something like the HoloLens using something like this pair of AR glasses (https://www.amazon.com/Headset-Glasses-Augmented-Reality-Android/dp/B07N2PXK8W/ref=psdc_14775002011_t1_B07BGZ8Z81), where the A.I. can communicate with me through headphones and a visual screen superimposed on my field of view.

Title: Re: LSD-SLAM: Large-Scale Direct Monocular SLAM
Post by: infurl on July 02, 2020, 12:32:59 am
This looks fascinating. I wonder if it could leverage video compression algorithms, or even be implemented directly in hardware.
Title: Re: LSD-SLAM: Large-Scale Direct Monocular SLAM
Post by: silent one on July 02, 2020, 01:43:55 am
Getting 3D with a single camera is exactly the same as stereo, except you don't know the baseline distance between the two shots.
So maybe you could brute-force it?
Title: Re: LSD-SLAM: Large-Scale Direct Monocular SLAM
Post by: infurl on July 02, 2020, 02:18:27 am
Getting 3D with a single camera is exactly the same as stereo, except you don't know the baseline distance between the two shots. So maybe you could brute-force it?

The camera images would have to be integrated with data from a motion sensor so the velocity could be estimated. I haven't had time to go through the article in detail yet but that's how I imagine it works. For what it's worth, the human eye doesn't process static images either. Even when we are focused on one thing, our eyes are constantly moving imperceptibly and it's the changes that register with our brains.
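Once the camera translation between two frames is known (from a motion sensor or otherwise), depth falls out of simple triangulation. A minimal numpy sketch, assuming a rectified image pair with focal length f in pixels, baseline B in metres, and per-pixel disparity d, so Z = f·B / d (the function name and parameters are illustrative, not from LSD-SLAM):

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline_m):
    """Triangulate per-pixel depth from a rectified image pair.

    Z = f * B / d, where f is the focal length in pixels,
    B the baseline (camera translation) in metres, and d the
    disparity in pixels. Non-positive disparities are marked
    invalid (infinite depth).
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Example: 20 px of disparity with f = 700 px and B = 0.1 m
# gives Z = 700 * 0.1 / 20 = 3.5 m.
d = np.array([[20.0, 0.0], [7.0, 35.0]])
print(depth_from_disparity(d, 700.0, 0.1))
```

Without the motion sensor you only get depth up to an unknown scale, since B drops out as a global factor.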
Title: Re: LSD-SLAM: Large-Scale Direct Monocular SLAM
Post by: silent one on July 02, 2020, 04:47:10 am
Yeah, with a motion sensor it would be much easier. Then you know the camera positions, and the depth map is discoverable. I was just saying it's nearly exactly the same without the motion sensor, if you managed to try all positions away from the origin (the last camera position) and picked the best. It's a lot more work for the PC though, best suited to some iteration-intensive GPU code.
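The brute-force idea can be sketched in a few lines: try each candidate baseline, warp one image into the other under that hypothesis, and keep the baseline with the smallest photometric error. A 1-D toy in numpy, assuming (for illustration only) a known constant scene depth and purely horizontal camera translation; in a real system the depth map and motion would be estimated jointly:

```python
import numpy as np

def photometric_error(img_ref, img_cur, depth, focal, baseline):
    """Warp img_ref into the current view assuming a horizontal
    translation `baseline`, then sum absolute intensity differences.
    1-D toy: pixel u shifts by its disparity f * B / Z."""
    u = np.arange(img_ref.size)
    u_warp = np.round(u - focal * baseline / depth).astype(int)
    ok = (u_warp >= 0) & (u_warp < img_cur.size)
    return np.abs(img_ref[ok] - img_cur[u_warp[ok]]).sum()

def brute_force_baseline(img_ref, img_cur, depth, focal, candidates):
    """Try every candidate baseline and pick the most photo-consistent."""
    errors = [photometric_error(img_ref, img_cur, depth, focal, b)
              for b in candidates]
    return candidates[int(np.argmin(errors))]

# Toy scene: random texture at a constant 4 m depth, f = 100 px,
# true baseline 0.2 m  ->  a uniform disparity of 5 px.
rng = np.random.default_rng(0)
img_ref = rng.random(200)
depth = np.full(200, 4.0)
img_cur = np.roll(img_ref, -5)            # the shifted second view
candidates = np.linspace(0.05, 0.5, 10)   # baselines to brute-force
print(brute_force_baseline(img_ref, img_cur, depth, 100.0, candidates))
```

Each candidate is independent of the others, which is exactly why this maps well onto a GPU.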