This server localizes a landmark from an input image.
Optionally, configure the codespace with a 4-core CPU for better performance.
This codespace is configured with NVIDIA CUDA and uses the codespace's GPU. cuDNN and the other required dependencies are installed automatically. The configuration is in devcontainer.json.
Note
Free users have 120 core-hours per month and Pro users have 180 core-hours per month on GitHub Codespaces. The default codespace runs on a 2-core machine, which gives 60 hours (Free) or 90 hours (Pro) of usage per month before you are charged. Make sure to stop your codespace when you are not using it (by default, it stops automatically after 30 minutes of inactivity). See more pricing details here.
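The figures in the note are simply the monthly quota divided by the number of cores. The sketch below repeats that arithmetic and also shows what the optional 4-core machine would give, assuming the quota values quoted in the note. `usage_hours` is a hypothetical helper written for illustration, not part of this repository.

```python
def usage_hours(core_hours_quota: float, machine_cores: int) -> float:
    """Return the wall-clock hours covered by a monthly core-hour quota."""
    if machine_cores <= 0:
        raise ValueError("machine_cores must be positive")
    return core_hours_quota / machine_cores


# Quota values are the ones quoted in the note above.
for plan, quota in (("Free", 120), ("Pro", 180)):
    for cores in (2, 4):
        print(f"{plan}, {cores}-core: {usage_hours(quota, cores):.0f} hours")
```

On a 4-core machine the same quota lasts half as long as on the default 2-core machine.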
Python version at the time of writing: 3.10.13
Run
sudo apt update && sudo apt install -y libsm6 libxext6 ffmpeg libfontconfig1 libxrender1 libgl1-mesa-glx
Download and install the custom Hierarchical Localization package (slightly modified from version 1.3).
git clone --recursive https://github.com/KOE-Wayfind/Hierarchical-Localization
cd Hierarchical-Localization
pip install -e .
pip install --upgrade plotly
cd ..
Download KOE Image Dataset
git clone https://github.com/KOE-Wayfind/koe-datasets.git
python my_hloc.py
This process takes roughly 8-10 minutes. It constructs the model from the given image dataset.
At the end, you should see output similar to:
[2023/07/04 02:47:32 hloc INFO] Reconstruction statistics:
Reconstruction:
num_reg_images = 57
num_cameras = 1
num_points3D = 2656
num_observations = 14258
mean_track_length = 5.36822
mean_observations_per_image = 250.14
mean_reprojection_error = 0.971601
num_input_images = 70
Note
The root requirements.txt constrains the pycolmap version to 0.4.0, which overrides the version required by the hloc dependencies. See issue #3.
pip install -r requirements.txt
python app.py
The server will run on localhost:5000
Use the installed Thunder Client (the lightning icon). Import the sample thunder-collection_Hloc API.json into Thunder Client.
Try running the Hloc API/Localize me request.
The image_data in the example request is this image:
So, the expected result is as follows:
{
"result": "Conference Room A"
}
To test with your own image, encode the image to Base64. You might want to resize it to 512×512 px first. You can use an Image to Base64 encoder.
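If you prefer to do the encoding locally, a minimal sketch using only the Python standard library could look like the following. The function names are hypothetical. The sketch assumes the request body is a JSON object with the image_data field holding a plain Base64 string (no data-URI prefix), so check the sample Thunder Client request for the exact shape. Resizing is left out because it needs an image library such as Pillow.

```python
import base64
import json
from pathlib import Path


def encode_image_to_base64(image_path: str) -> str:
    """Read an image file and return its bytes as a Base64 ASCII string."""
    raw = Path(image_path).read_bytes()
    return base64.b64encode(raw).decode("ascii")


def build_payload(image_b64: str) -> str:
    """Wrap the encoded image in a JSON body (assumed field: image_data)."""
    return json.dumps({"image_data": image_b64})
```

For example, `build_payload(encode_image_to_base64("my_photo.jpg"))` returns a string that can be pasted as the request body, where `my_photo.jpg` is a placeholder for your own file.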
Note
We are running a development server. Ideally, the server would be deployed to a production WSGI server (learn more here), but since we are just testing, this should be okay. 🙈
To assign this server to KOE-Wayfinder App, follow the steps below.
Go to the PORTS tab.
Find port 5000 in the list (named Server Endpoint). Change the port visibility from private to public.
On the same port information, copy its address. Example: https://iqfareez-improved-fishstick-q45gqxjgrjg24qqx-5000.preview.app.github.dev/
You have two options: update the URL in the source code or in the app settings.
In the source code, navigate to Assets/Scripts/LocalizationSettings.cs and update the default value of the serverUrl player prefs entry.
Or, in the app, go to the localization page, open the localization settings, and update the URL directly.
- Sometimes, when you send an image from the client, the server responds with a 400 error or another error. Try killing and restarting the server (you may need to do this a few times); this usually fixes it. You can also use Insomnia or Postman to debug the request.
- Inference is slow, taking about 10-20 seconds.
- The server can handle only one localization request at a time. Technically, this Flask server can handle concurrent users, but the image endpoint (the localization process) can run only one process at a time.
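One way the single-request limitation could be made explicit is a non-blocking lock around the localization step, so that a second caller gets an immediate "busy" answer instead of an obscure failure. The sketch below is an illustration only, assuming a thread-based server; `run_exclusive` and `_localization_lock` are hypothetical names, and this is not how app.py is known to behave.

```python
import threading

# Hypothetical guard; not taken from the project's app.py.
_localization_lock = threading.Lock()


def run_exclusive(task):
    """Run `task` only if no other localization is in progress.

    Returns (True, result) when the task ran, or (False, None) when
    another caller already holds the lock.
    """
    if not _localization_lock.acquire(blocking=False):
        return False, None
    try:
        return True, task()
    finally:
        # Always release, even if the task raised an exception.
        _localization_lock.release()
```

A request handler could call `run_exclusive` with the localization function and return a "server busy" response when the first element of the tuple is False.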