Map area clipping issue #7

Open
sxwee opened this issue Jul 28, 2023 · 6 comments

Comments

@sxwee

sxwee commented Jul 28, 2023

I followed your processing procedure to handle the Porto dataset, but when I ran it on RNTrajRec, I encountered a KeyError. It seems that some road segments matched to the trajectory points by the HMM are not within the specified area of the Porto road network (they do not belong to the valid edge set). Have you encountered this situation before? How should I handle it?
Error Info:

Traceback (most recent call last):
  File "multi_main.py", line 244, in <module>
    train_dataset = Dataset(rn, traj_root, mbr, args, 'train')
  File "/xxx/RNTrajRec-master/dataset.py", line 41, in __init__
    self.get_data(trajs_dir, parameters.win_size, parameters.ds_type, parameters.keep_ratio)
  File "/xxx/RNTrajRec-master/dataset.py", line 117, in get_data
    keep_ratio=1)
  File "/xxx/RNTrajRec-master/dataset.py", line 164, in parse_traj
    mm_gps_seq, mm_eids, mm_rates = self.get_trg_seq(tmp_pt_list)
  File "/xxx/RNTrajRec-master/dataset.py", line 240, in get_trg_seq
    mm_eids.append([self.rn.valid_edge_one[candi_pt.eid]])  # keep the same format as seq
KeyError: 8988

Furthermore, I'm curious why you chose the area (41.111975, -8.667057, 41.177462, -8.585305)? Is it determined based on the latitude and longitude of the trajectory points in the training, validation, and test sets?
I look forward to receiving your assistance!

@chenyuqi990215
Owner

The area chosen contains most of the trajectories. As we mentioned in the paper, we only keep the urban area for training to save memory costs.

The issue occurs because road segment #8988 does not fall inside the area. You can either remove these trajectories from your training dataset or, if you have a GPU with enough memory, enlarge the training area in multi_main.py.
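
The first option, dropping trajectories that touch a segment outside the area, might look roughly like the sketch below. It assumes each map-matched trajectory is available as a list of edge IDs and that the valid edge set supports membership tests by edge ID (as `self.rn.valid_edge_one` appears to in the traceback). The function `filter_trajectories` and the sample data are hypothetical and not part of RNTrajRec.

```python
def filter_trajectories(trajs, valid_edges):
    """Split trajectories into kept and dropped, based on edge membership.

    trajs: dict mapping a trajectory ID to its list of map-matched edge IDs.
    valid_edges: container of edge IDs that lie inside the training area.
    """
    kept, dropped = {}, {}
    for traj_id, eids in trajs.items():
        # Keep a trajectory only if every matched edge is in the valid set.
        if all(eid in valid_edges for eid in eids):
            kept[traj_id] = eids
        else:
            dropped[traj_id] = eids
    return kept, dropped


# Hypothetical data: edge 8988 is not in the valid edge set.
valid_edge_one = {101: 1, 102: 2, 103: 3}
trajs = {"t1": [101, 102], "t2": [102, 8988, 103]}

kept, dropped = filter_trajectories(trajs, valid_edge_one)
print("kept:", sorted(kept))
print("dropped:", sorted(dropped))
```

Logging the dropped IDs as well as the kept ones makes it easy to see how much of the dataset the chosen area actually excludes.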

@sxwee
Author

sxwee commented Jul 31, 2023

Thank you for your advice. I have resolved the previous issues, but now I have encountered another bug. When I train the model, the following assertion throws a new error:

if not (output == hidden).all():
    import pdb
    pdb.set_trace()
assert (output == hidden).all()

Do you know why this error occurs?

@chenyuqi990215
Owner

Maybe the network is producing NaN outputs. You can check where the NaN comes from.
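
One reason NaN trips this particular assertion: under IEEE 754, NaN never compares equal to anything, including itself, so an elementwise `==` check fails even when both sides hold the same values. Below is a minimal illustration using plain Python floats in place of tensors; the helper names `all_equal` and `nan_positions` are made up for this sketch.

```python
import math


def all_equal(a, b):
    """Elementwise equality over two flat sequences, similar to (a == b).all()."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def nan_positions(values):
    """Return the indices of all NaN entries in a flat sequence of floats."""
    return [i for i, v in enumerate(values) if math.isnan(v)]


output = [0.5, float("nan"), 1.25]
hidden = list(output)  # holds exactly the same values as output

print(all_equal(output, hidden))  # False, because NaN != NaN
print(nan_positions(output))      # [1]
```

With PyTorch tensors, `torch.isnan(t).any()` gives the same kind of check and can be applied to the inputs and intermediate outputs in turn to narrow down where the first NaN appears.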

@sxwee
Author

sxwee commented Aug 3, 2023

I have followed your data preprocessing procedure as outlined below:

raw GPS trajectories(A)
A---------interpolate_trajectory.py-------->B
A---------hmm.py---------project_trajectory.py--------->C---------epsilon_trajectory.py--------->D
B as input, D as output

However, I have noticed that there is a significant difference in the GPS coordinates between the trajectory points in B and D. Is this considered normal?
[image: diff]

Furthermore, I have also utilized the tptk module to clean and filter the trajectories. Is this step necessary?

@customtiy13

customtiy13 commented Sep 1, 2023

> Thank you for your advice. I have resolved the previous issues, but now I have encountered another bug. When I train the model, the following assertion throws a new error:
>
> if not (output == hidden).all():
>     import pdb
>     pdb.set_trace()
> assert (output == hidden).all()
>
> Do you know why this error occurs?

I have encountered the same error. I printed out the output, and it is all NaN. Have you found a solution yet?

@chenyuqi990215
Owner

chenyuqi990215 commented Sep 2, 2023

Please check the following things:

  1. Replace search_dist (multi_main.py, line 135) and neighbor_dist (line 136) with a large enough value and see whether the NaN still occurs. If the NaN disappears, it was caused by an improper search_dist. Set search_dist to the largest possible distance error between an input GPS point and its target road segment, and set neighbor_dist to a value much larger than search_dist.
  2. Check where the NaN comes from: does the input contain NaN, or do the network parameters contain NaN?
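
To get a feel for a suitable search_dist, one option is to measure how far each raw GPS point lies from its map-matched position and take the maximum. The sketch below does this with the haversine formula. It assumes that raw and matched points are available as aligned lists of (lat, lng) pairs and that search_dist is expressed in meters, neither of which is confirmed in this thread; the function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def max_matching_error_m(raw_points, matched_points):
    """Largest distance between a raw GPS point and its map-matched point."""
    return max(
        haversine_m(r[0], r[1], m[0], m[1])
        for r, m in zip(raw_points, matched_points)
    )


# Hypothetical aligned samples inside the Porto area discussed above.
raw = [(41.1500, -8.6100), (41.1510, -8.6110)]
matched = [(41.1501, -8.6100), (41.1510, -8.6115)]

err = max_matching_error_m(raw, matched)
print(f"largest matching error: {err:.1f} m")
```

A search_dist below this value would leave at least one point without its target segment among the candidates, which is the situation item 1 is meant to rule out.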

We are glad to dig into the issue with you guys.

3 participants