Support custom label dropdowns from the dynamic config #89

Closed
shankari opened this issue Aug 31, 2023 · 65 comments

Comments

@shankari
Contributor

In a recent phone change, we started supporting a custom set of labels in the dropdown.

The custom set of labels is specified as a separate file linked from the dynamic config, and includes the kgCO2/km for each mode.
e-mission/e-mission-docs#945

However, this change has not yet been implemented in the public dashboard. The public dashboard still reads the user-specific mappings and the CO2 and energy equivalents from files hardcoded in this repo (notably in viz_scripts/auxiliary_files).

This means that the custom labels will be mapped to Other, which in turn, means that the public dashboard becomes less meaningful. This is likely to be particularly challenging in international contexts - e.g.
https://usaid-laos-ev-openpath.nrel.gov/public/

The public dashboard should read the values from the dynamic config as well if they are present, and fall back to defaults if they are not.

Note that depending on how we convert between energy and emissions, we may have to have the energy equivalent stored in the trip_confirm lists as well. You can submit a PR (similar to e-mission/nrel-openpath-deploy-configs#32) to add them if required.

@ananta-nrel can you please handle this?

@shankari
Contributor Author

This is likely to be particularly challenging in international contexts - e.g.
https://usaid-laos-ev-openpath.nrel.gov/public/

Note, however, that Laos is not currently using custom labels - this is likely because the user truly added a custom mode (e.g. "motorcycle" or "rickshaw") by selecting "other" from the dropdown and then kept using it. We will not tackle that use case now, only custom labels specified by program admins and included in the dynamic config.

@shankari shankari moved this to Issues being worked on in OpenPATH Tasks Overview Sep 1, 2023
@shankari
Contributor Author

shankari commented Sep 1, 2023

suggested steps going forward

  • look at generic metrics
  • when we generate the graphs, we show the proportion of Gas car, drove alone
  • if you read the raw data, it will have drove_alone
  • poke around at the code and tell me where that mapping happens

@iantei
Contributor

iantei commented Sep 1, 2023

mapping_dictionaries.ipynb builds a dictionary whose keys are replaced_mode values (e.g. drove_alone) and whose values are the matching mode_clean strings (e.g. "Gas car, drove alone"). It is stored as dic_re = dict(zip(df_re['replaced_mode'],df_re['mode_clean'])) # bin modes.
generic_metrics.ipynb then retrieves it with %store -r dic_re.

@shankari
Contributor Author

shankari commented Sep 1, 2023

Close, but that is for the replaced mode, not for the mode. The very first figure, for example, does not use the replaced mode at all.

@iantei
Contributor

iantei commented Sep 1, 2023

The saved dic_re is passed into scaffolding.load_viz_notebook_data(..., dic_re, ...).
Inside scaffolding.py, load_viz_notebook_data(..., dic_re, ...) uses it in:
expanded_ct['Mode_confirm']= expanded_ct['mode_confirm'].map(dic_re)
This statement looks up each value of expanded_ct['mode_confirm'] as a key in dic_re. E.g. if the value of expanded_ct['mode_confirm'] is "drove_alone" and dic_re holds the key:value pair <drove_alone: "Gas car, drove alone">, the value of expanded_ct['Mode_confirm'] becomes "Gas car, drove alone". Every other row is mapped the same way.
So a new column "Mode_confirm" is added to expanded_ct, holding the mapped display labels.
The first figure uses Mode_confirm.
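
As a rough illustration of the lookup described above (a minimal sketch, not the scaffolding.py code): the list comprehension below mimics what Series.map does with a dictionary, except that pandas leaves NaN where this sketch leaves None. The "walk" entry and the sample values are made up for the example.

```python
# Illustrative subset of the mapping; the real dic_re is built in mapping_dictionaries.ipynb.
dic_re = {"drove_alone": "Gas car, drove alone", "walk": "Walk"}

# Hypothetical raw mode_confirm values, including one label that dic_re does not know.
raw_mode_confirm = ["drove_alone", "walk", "moped"]

# Each raw value is looked up as a key; unknown labels end up without a display value.
mode_confirm_display = [dic_re.get(raw) for raw in raw_mode_confirm]
print(mode_confirm_display)
```

A label that is missing from the mapping has no display value, which is why custom labels need their own mapping.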

@shankari
Contributor Author

shankari commented Sep 1, 2023

so you now need to read the mapping (the equivalent of dic_re) from the dynamic labels if the dynamic config indicates that the study/program uses dynamic labels. Submit a draft PR for that change.

@iantei
Contributor

iantei commented Sep 2, 2023

The approach I'm undertaking is the following:

  1. Inside generate_plots.py: identify whether the dynamic labels are present in the json (dynamic_config) or not. [check for label_options in dynamic_config]
    Set a bool has_dynamic_labels to True if they are present.
    Unravel the json associated with label_options and store it in a variable.
    Pass these variables to the notebook through nbp.parameter_values
  2. Extract these values in the notebook.
    If has_dynamic_labels is true, use the passed json to fill in Mode_confirm.
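
The two steps above could be sketched roughly as follows. This is only a sketch under assumptions: it assumes label_options holds a link to the label file, that the file has a translations section keyed by language (as in the example file), and the helper names and the fetch_json callable are hypothetical, not existing functions in this repo.

```python
def extract_dynamic_labels(dynamic_config, fetch_json):
    """Return (has_dynamic_labels, dynamic_labels) for a downloaded dynamic config.

    fetch_json is a hypothetical callable that downloads and parses the
    label file linked from the config; it is injected to keep the sketch offline.
    """
    label_options_url = dynamic_config.get("label_options")
    if not label_options_url:
        # No custom labels configured: the notebook receives an empty dict.
        return False, {}
    return True, fetch_json(label_options_url)


def build_mode_mapping(dynamic_labels, default_mapping, lang="en"):
    """Prefer the translations from the dynamic labels, else fall back to the default."""
    translations = dynamic_labels.get("translations", {}).get(lang, {})
    return dict(translations) if translations else dict(default_mapping)


# Usage with made-up data standing in for the real config and label file.
sample_labels = {"translations": {"en": {"drove_alone": "Gas Car Drove Alone"}}}
sample_config = {"label_options": "https://example.invalid/label-options.json"}
has_dynamic_labels, dynamic_labels = extract_dynamic_labels(
    sample_config, lambda url: sample_labels
)
mode_mapping = build_mode_mapping(dynamic_labels, {"drove_alone": "Gas Car, drove alone"})
print(has_dynamic_labels, mode_mapping)
```

Keeping the fallback inside one function means the notebook can always call the same code path, whether or not the program defines custom labels.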

In development mode, can we execute generate_plots.py so that it in turn calls the respective notebook (generic_metrics.ipynb)? When I tried this, it looked for a few modules from emission.storage, which it wasn't able to access.

So, I understand one approach would be to set the variables directly at the start of the notebook and then execute it. Is there an alternative way to test in development mode?

@iantei
Contributor

iantei commented Sep 3, 2023

There is a disparity between the mappings in mode_labels.csv and in example-program-label-options.json (referenced from dev-emulator-program.nrel-op.json).
In mode_labels.csv, we have mode_confirm[drove_alone] -> mode_clean["Gas Car, drove alone"].
In example-program-label-options.json, however, translations[en] maps drove_alone to "Gas Car Drove Alone".
This mismatch causes a problem in the plots.py function pie_chart_mode(plot_title, labels, values, file_name). The newly mapped labels contain keys from Mode_confirm that were mapped with the dynamic_labels approach rather than with dic_re, so calling colours[key] crashes and no pie chart is produced.

all_labels= ['Gas Car, drove alone',
'Bus',
'Train',
'Free Shuttle',
'Taxi/Uber/Lyft',
'Gas Car, with others',
'Bikeshare',
'Scooter share',
'E-bike',
'Walk',
'Skate board',
'Regular Bike',
'Not a Trip',
'No Travel',
'Same Mode',
'E-car, drove alone',
'E-car, with others',
'Air',
'Other']
colours = dict(zip(all_labels, plt.cm.tab20.colors[:len(all_labels)]))
colors=[colours[key] for key in labels]  # this would crash with a KeyError, since it looks up the key "Gas Car Drove Alone" instead of "Gas Car, drove alone"

I understand the fix would be to make appropriate changes with example-program-label-options.json to match it with mode_labels.csv.

@shankari
Contributor Author

shankari commented Sep 4, 2023

In development mode, can we execute generate_plots.py so that it in turn calls the respective notebook (generic_metrics.ipynb)? When I tried this, it looked for a few modules from emission.storage, which it wasn't able to access.

Again, this is not very useful. What did you try to do to execute the generate_plots.py? What error did you get?
All the notebooks use modules from emission.storage, so I don't know what you mean by this statement.

I understand the fix would be to make appropriate changes with example-program-label-options.json to match it with mode_labels.csv.

Your understanding is incorrect. Please read the high-level goals in the issue. It is not "support example-program-label-options.json". It is "support custom labels for each program/study" - example-program-label-options.json is just an example.

Given that the list of labels can be custom, we can no longer hardcode the list of colors. I don't see why the list of colors needs to be hardcoded in the first place given that we are mapping it to plt.cm.tab20.colors anyway. We should just have all_labels be the list of modes in the custom list instead of a hardcoded list.

Note that the list of modes now also specifies the basemode. As a stretch goal, you can use the number of modes of each base mode to get shades of the basemode color and be consistent with the phone UI. But that is a future enhancement after we get the basic custom mode display correct.
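
One way to drop the hardcoded list, sketched under assumptions: the helper name is hypothetical, and the short palette below only stands in for plt.cm.tab20.colors so that the example runs without matplotlib. Colours are reused cyclically if a program defines more labels than the palette has entries, which is a choice made for this sketch rather than something decided in this thread.

```python
from itertools import cycle, islice


def build_colour_map(labels, palette):
    """Assign one palette colour to each distinct label, in first-seen order."""
    unique_labels = list(dict.fromkeys(labels))  # de-duplicate, keep order
    return dict(zip(unique_labels, islice(cycle(palette), len(unique_labels))))


# Stand-in palette of RGB tuples; in the dashboard this could be plt.cm.tab20.colors.
palette = [(0.12, 0.47, 0.71), (1.0, 0.5, 0.05)]

# Labels as they might come from a program's custom list (illustrative values).
custom_labels = ["Gas Car Drove Alone", "Moped", "Walk", "Moped"]
colours = build_colour_map(custom_labels, palette)

# Every label now has a colour, so this lookup cannot raise a KeyError.
colors = [colours[key] for key in custom_labels]
print(colours)
```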

@iantei
Contributor

iantei commented Sep 5, 2023

I was validating my changes against the only available example of "support custom labels for each program/study", which is example-program-label-options.json. The mapping in that example file does not match the one in mode_labels.csv. That is why I suggested updating example-program-label-options.json accordingly.

@shankari
Contributor Author

shankari commented Sep 5, 2023

@iantei you cannot update example-program-label-options.json. Our program partners can give us whatever they want in label-options.json and we have to be able to support it. That is the goal of this new feature.

Please read through my comments on "hardcoded list of colors" carefully.

@iantei
Contributor

iantei commented Sep 28, 2023

Testing done so far:
Scenario I: walk - with default mapping
Dataset used: vail_2022-05-09.tar.gz
STUDY_CONFIG=stage-program

Executed the following:

A. Executed for generic_metrics notebooks

(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/update_mappings.py mapping_dictionaries.ipynb
(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics.ipynb default

Results:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-27T23:40:59.466627+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-27T23:40:59.509457+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-27T23:41:24.704470+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-27T23:41:32.054426+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-27T23:41:40.135223+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-27T23:41:48.939444+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-27T23:41:56.065759+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-27T23:42:04.484404+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
[Three images of the generated plots]
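
The SyntaxWarning at the top of the output above comes from comparing the status code by identity rather than by value. A minimal sketch of an equality-based check follows; the function name is made up for illustration, and only the quoted warning line is known from generate_plots.py.

```python
def config_download_failed(status_code):
    """Return True when the HTTP status code is anything other than 200."""
    # `!=` compares values; `is not` compares object identity, which Python
    # warns about when one side is a literal.
    return status_code != 200


print(config_download_failed(200), config_download_failed(404))
```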

B. Executed for generic_metrics_sensed notebook: [Gives Error]

(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics_sensed.ipynb default

Result:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T00:08:46.554062+00:00 with args Namespace(plot_notebook='generic_metrics_sensed.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T00:08:46.601682+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Traceback (most recent call last):
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 106, in <module>
    compute_for_date(None, None)
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 103, in compute_for_date
    nbclient.execute(new_nb)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1305, in execute
    return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/jupyter_core/utils/__init__.py", line 166, in wrapped
    return loop.run_until_complete(inner)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 705, in async_execute
    await self.async_execute_cell(
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1058, in async_execute_cell
    await self._check_raise_for_error(cell, cell_index, exec_reply)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 914, in _check_raise_for_error
    raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
expanded_ct, file_suffix, quality_text, debug_df = scaffolding.load_viz_notebook_sensor_inference_data(year,
                                                                            month,
                                                                            program,
                                                                            include_test_users,
                                                                            sensed_algo_prefix)
------------------

----- stdout -----
Loaded all confirmed trips of length 57407
----- stdout -----
After filtering, found 57407 participant trips
----- stdout -----
Loaded expanded_ct with length 57407 for None
------------------

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 expanded_ct, file_suffix, quality_text, debug_df = scaffolding.load_viz_notebook_sensor_inference_data(year,
      2                                                                             month,
      3                                                                             program,
      4                                                                             include_test_users,
      5                                                                             sensed_algo_prefix)

File /usr/src/app/saved-notebooks/scaffolding.py:189, in load_viz_notebook_sensor_inference_data(year, month, program, include_test_users, sensed_algo_prefix)
    187 print(f"Loaded expanded_ct with length {len(expanded_ct)} for {tq}")
    188 if len(expanded_ct) > 0:
--> 189     expanded_ct["primary_mode_non_other"] = participant_ct_df.cleaned_section_summary.apply(lambda md: max(md["distance"], key=md["distance"].get))
    190     expanded_ct.primary_mode_non_other.replace({"ON_FOOT": "WALKING"}, inplace=True)
    191     valid_sensed_modes = ["WALKING", "BICYCLING", "IN_VEHICLE", "AIR_OR_HSR", "UNKNOWN"]

File ~/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pandas/core/series.py:4771, in Series.apply(self, func, convert_dtype, args, **kwargs)
   4661 def apply(
   4662     self,
   4663     func: AggFuncType,
   (...)
   4666     **kwargs,
   4667 ) -> DataFrame | Series:
   4668     """
   4669     Invoke function on values of Series.
   4670 
   (...)
   4769     dtype: float64
   4770     """
-> 4771     return SeriesApply(self, func, convert_dtype, args, kwargs).apply()

File ~/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py:1123, in SeriesApply.apply(self)
   1120     return self.apply_str()
   1122 # self.f is Callable
-> 1123 return self.apply_standard()

File ~/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py:1174, in SeriesApply.apply_standard(self)
   1172     else:
   1173         values = obj.astype(object)._values
-> 1174         mapped = lib.map_infer(
   1175             values,
   1176             f,
   1177             convert=self.convert_dtype,
   1178         )
   1180 if len(mapped) and isinstance(mapped[0], ABCSeries):
   1181     # GH#43986 Need to do list(mapped) in order to get treated as nested
   1182     #  See also GH#25959 regarding EA support
   1183     return obj._constructor_expanddim(list(mapped), index=obj.index)

File ~/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pandas/_libs/lib.pyx:2924, in pandas._libs.lib.map_infer()

File /usr/src/app/saved-notebooks/scaffolding.py:189, in load_viz_notebook_sensor_inference_data.<locals>.<lambda>(md)
    187 print(f"Loaded expanded_ct with length {len(expanded_ct)} for {tq}")
    188 if len(expanded_ct) > 0:
--> 189     expanded_ct["primary_mode_non_other"] = participant_ct_df.cleaned_section_summary.apply(lambda md: max(md["distance"], key=md["distance"].get))
    190     expanded_ct.primary_mode_non_other.replace({"ON_FOOT": "WALKING"}, inplace=True)
    191     valid_sensed_modes = ["WALKING", "BICYCLING", "IN_VEHICLE", "AIR_OR_HSR", "UNKNOWN"]

TypeError: 'float' object is not subscriptable
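
The TypeError suggests that at least one cleaned_section_summary value is a float (possibly NaN for a trip without a summary) rather than a dict; that cause is an assumption, not something confirmed in this thread. A defensive version of the lambda could look like the sketch below, where the helper name is hypothetical.

```python
def primary_sensed_mode(section_summary):
    """Return the mode with the largest distance, or None if there is no usable summary."""
    if not isinstance(section_summary, dict):
        # Covers NaN/float placeholders for trips without a section summary.
        return None
    distances = section_summary.get("distance")
    if not distances:
        return None
    return max(distances, key=distances.get)


# Illustrative inputs: one well-formed summary and one missing value.
print(primary_sensed_mode({"distance": {"WALKING": 100.0, "BICYCLING": 2500.0}}))
print(primary_sensed_mode(float("nan")))
```

Such a helper could be passed to apply in place of the lambda; rows that return None would then need to be dropped or assigned to a fallback mode before plotting.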

C. Executed for generic_timeseries notebooks

Executed the following:

(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_timeseries.ipynb default

Result:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T00:20:59.704644+00:00 with args Namespace(plot_notebook='generic_timeseries.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T00:20:59.740571+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T00:21:17.618467+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T00:21:23.388702+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T00:21:28.670491+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T00:21:34.563356+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]

[Image of the generated plot]

D. Executed for mode_specific_metrics notebooks

Executed the following code:

(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py mode_specific_metrics.ipynb default

Results:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T00:26:54.014794+00:00 with args Namespace(plot_notebook='mode_specific_metrics.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T00:26:54.051267+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T00:27:10.550640+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T00:27:17.220960+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]

E. Executed for mode_specific_timeseries notebooks [Gives error]

Executed the following:

(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py mode_specific_timeseries.ipynb default

Results:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T00:37:18.773477+00:00 with args Namespace(plot_notebook='mode_specific_timeseries.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T00:37:18.804130+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T00:37:53.126661+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Traceback (most recent call last):
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 110, in <module>
    compute_for_date(month_year.month, month_year.year)
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 103, in compute_for_date
    nbclient.execute(new_nb)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1305, in execute
    return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/jupyter_core/utils/__init__.py", line 166, in wrapped
    return loop.run_until_complete(inner)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 705, in async_execute
    await self.async_execute_cell(
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1058, in async_execute_cell
    await self._check_raise_for_error(cell, cell_index, exec_reply)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 914, in _check_raise_for_error
    raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
quality_text = scaffolding.get_quality_text(expanded_ct, mode_counts_interest, mode_of_interest, include_test_users)
------------------


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[9], line 1
----> 1 quality_text = scaffolding.get_quality_text(expanded_ct, mode_counts_interest, mode_of_interest, include_test_users)

NameError: name 'mode_counts_interest' is not defined

F. Executed for energy_calculations notebooks

Executed the following:

python bin/generate_plots.py energy_calculations.ipynb default

Results:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T00:46:59.894921+00:00 with args Namespace(plot_notebook='energy_calculations.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T00:46:59.934193+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T00:47:13.731884+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T00:47:17.549366+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]

@iantei
Contributor

iantei commented Sep 28, 2023

Testing done so far: [DRAFT: Still updating]
Scenario II: moped - with default mapping
Dataset used: vail_2022-05-09.tar.gz
STUDY_CONFIG=stage-program

A. Change the mode_confirm "walk" to "moped":

Showcasing existing entries in the mongodb:

ashrest2-35384s:em-public-dashboard ashrest2$ docker exec -it em-public-dashboard-db-1 mongo
MongoDB shell version v4.4.0
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("727fcd12-c272-4a0f-9b2e-1f6a7ce489b2") }
MongoDB server version: 4.4.0
---
The server generated these startup warnings when booting: 
        2023-09-27T23:36:15.782+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
        2023-09-27T23:36:16.838+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> clear
uncaught exception: ReferenceError: clear is not defined :
@(shell):1:1
> show dbs
Stage_database  13.751GB
admin            0.000GB
config           0.000GB
local            0.000GB
> use Stage_database
switched to db Stage_database

Results:

> db.Stage_analysis_timeseries.find({"data.user_input.mode_confirm":"walk"}).count()
4359

Executed the below script to make changes on the mongoDb:

var result_walk = db.Stage_analysis_timeseries.find({"data.user_input.mode_confirm":"walk"}, {"_id": 1})

var resultArray = result_walk.toArray();
for (var i = 0; i < resultArray.length; i++) {
    var doc = resultArray[i];
    var docId = doc._id;
    var updateResult = db.Stage_analysis_timeseries.update(
        { "_id": docId, "data.user_input.mode_confirm": "walk" },
        { "$set": { "data.user_input.mode_confirm": "moped" } }
    );
    if (updateResult.nModified > 0) {
        print(`Updated document with _id: ${docId}`);
    }
}

Result of above execution:

> db.Stage_analysis_timeseries.find({"data.user_input.mode_confirm":"walk"}).count()
0
> db.Stage_analysis_timeseries.find({"data.user_input.mode_confirm":"moped"}).count()
4359
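For reference, the same relabeling can probably be done in one bulk call instead of a per-document loop. This is only a sketch, not what was run above: the helper name `relabel_mode` is made up for illustration, and it assumes `collection` is a pymongo `Collection` (or anything exposing the same `update_many` method).

```python
# Hypothetical helper, not part of em-public-dashboard.
MODE_FIELD = "data.user_input.mode_confirm"


def relabel_mode(collection, old_mode, new_mode):
    """Relabel every document whose mode_confirm equals old_mode.

    `collection` is assumed to behave like a pymongo Collection:
    update_many(filter, update) returns an object exposing modified_count.
    """
    result = collection.update_many(
        {MODE_FIELD: old_mode},              # match only the old label
        {"$set": {MODE_FIELD: new_mode}},    # overwrite just that one field
    )
    return result.modified_count
```

Against the database above, the call would look something like `relabel_mode(client["Stage_database"]["Stage_analysis_timeseries"], "walk", "moped")`, where `client` is a `pymongo.MongoClient` connected to the db container (the connection details are an assumption).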

B. Restarted docker compose and executed the following:

ashrest2-35384s:em-public-dashboard ashrest2$ docker exec -it em-public-dashboard-dashboard-1 /bin/bash
OCI runtime exec failed: exec failed: unable to start container process: exec: "/bin/bash": stat /bin/bash: no such file or directory: unknown
ashrest2-35384s:em-public-dashboard ashrest2$ docker ps
CONTAINER ID   IMAGE                         COMMAND                  CREATED       STATUS         PORTS                               NAMES
75ac05b4ded2   em-pub-dash-dev/frontend      "docker-entrypoint.s…"   2 hours ago   Up 2 minutes   0.0.0.0:3274->6060/tcp              em-public-dashboard-dashboard-1
61584fb3620f   em-pub-dash-dev/viz-scripts   "/bin/bash /usr/src/…"   2 hours ago   Up 2 minutes   8080/tcp, 0.0.0.0:47962->8888/tcp   em-public-dashboard-notebook-server-1
c850b9a6f83e   mongo:4.4.0                   "docker-entrypoint.s…"   4 weeks ago   Up 2 minutes   27017/tcp                           em-public-dashboard-db-1
ashrest2-35384s:em-public-dashboard ashrest2$ docker exec -it em-public-dashboard-notebook-server-1 /bin/bash
root@61584fb3620f:/usr/src/app# source setup/activate.sh
(emission) root@61584fb3620f:/usr/src/app# cd saved-notebooks/

Re-checked the state of MongoDB:

ashrest2-35384s:em-public-dashboard ashrest2$ docker exec -it em-public-dashboard-db-1 mongo
MongoDB shell version v4.4.0
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("e07e8164-0440-4afc-924c-e5dc5987f925") }
MongoDB server version: 4.4.0
---
The server generated these startup warnings when booting: 
        2023-09-28T01:08:13.856+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
        2023-09-28T01:08:14.670+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
Stage_database  13.751GB
admin            0.000GB
config           0.000GB
local            0.000GB
> use Stage_database
switched to db Stage_database
> clear
uncaught exception: ReferenceError: clear is not defined :
@(shell):1:1
> db.Stage_analysis_timeseries.find({"data.user_input.mode_confirm":"walk"}).count()
0
> db.Stage_analysis_timeseries.find({"data.user_input.mode_confirm":"moped"}).count()
4359

C. Executed generic_metrics with the moped data and refreshed the localhost page: [still showing old data]. This turned out not to be a real issue; see step D below.

Executed the following:

(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics.ipynb default

Results:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T01:19:53.638888+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T01:19:54.449342+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:20:42.475587+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:20:49.531736+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:20:57.037617+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:21:04.865504+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:21:11.312305+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:21:20.019433+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:21:28.482808+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:21:36.455178+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:21:43.789357+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:21:50.982021+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:21:57.869427+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:22:06.187501+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:22:14.686177+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:22:23.371530+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:22:31.543634+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
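As an aside, the SyntaxWarning at the top of this log points at `if r.status_code is not 200:` in generate_plots.py. `is not` compares object identity, not value, so whether it behaves as intended for integers depends on interpreter internals. A minimal sketch of the value comparison the warning suggests (the function name is hypothetical):

```python
def download_succeeded(status_code: int) -> bool:
    """Return True when the HTTP status code equals 200.

    Uses == (value equality) rather than `is` (object identity), which is
    what the SyntaxWarning in the log recommends.
    """
    return status_code == 200
```

In generate_plots.py itself the equivalent change would presumably be `if r.status_code != 200:`.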

Refreshing localhost still shows the chart below:

image

As you can see in the above chart, the latest pie_chart depicting "Moped" mode is not being updated. The timestamp doesn't match with the one from the log.

D. The chart is updated after refreshing the page ~30 minutes later. Fix: let the notebook execution run to completion (up to the present month) before refreshing.

I travelled for half an hour or so, and the chart has now been updated as shown below. After refreshing the page, it loads properly, with the "Moped" data shown as Others:

image

Execution error with generic_metrics_sensed notebook

Execution:

```
(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics_sensed.ipynb default
```

Result:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T02:09:27.333781+00:00 with args Namespace(plot_notebook='generic_metrics_sensed.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T02:09:27.492882+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Traceback (most recent call last):
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 106, in <module>
    compute_for_date(None, None)
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 103, in compute_for_date
    nbclient.execute(new_nb)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1305, in execute
    return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/jupyter_core/utils/__init__.py", line 166, in wrapped
    return loop.run_until_complete(inner)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 705, in async_execute
    await self.async_execute_cell(
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1058, in async_execute_cell
    await self._check_raise_for_error(cell, cell_index, exec_reply)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 914, in _check_raise_for_error
    raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
expanded_ct, file_suffix, quality_text, debug_df = scaffolding.load_viz_notebook_sensor_inference_data(year,
                                                                            month,
                                                                            program,
                                                                            include_test_users,
                                                                            sensed_algo_prefix)
------------------

----- stdout -----
Loaded all confirmed trips of length 57407
----- stdout -----
After filtering, found 57407 participant trips
----- stdout -----
Loaded expanded_ct with length 57407 for None
------------------

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 expanded_ct, file_suffix, quality_text, debug_df = scaffolding.load_viz_notebook_sensor_inference_data(year,
      2                                                                             month,
      3                                                                             program,
      4                                                                             include_test_users,
      5                                                                             sensed_algo_prefix)

File /usr/src/app/saved-notebooks/scaffolding.py:189, in load_viz_notebook_sensor_inference_data(year, month, program, include_test_users, sensed_algo_prefix)
    187 print(f"Loaded expanded_ct with length {len(expanded_ct)} for {tq}")
    188 if len(expanded_ct) > 0:
--> 189     expanded_ct["primary_mode_non_other"] = participant_ct_df.cleaned_section_summary.apply(lambda md: max(md["distance"], key=md["distance"].get))
    190     expanded_ct.primary_mode_non_other.replace({"ON_FOOT": "WALKING"}, inplace=True)
    191     valid_sensed_modes = ["WALKING", "BICYCLING", "IN_VEHICLE", "AIR_OR_HSR", "UNKNOWN"]

File ~/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pandas/core/series.py:4771, in Series.apply(self, func, convert_dtype, args, **kwargs)
   4661 def apply(
   4662     self,
   4663     func: AggFuncType,
   (...)
   4666     **kwargs,
   4667 ) -> DataFrame | Series:
   4668     """
   4669     Invoke function on values of Series.
   4670 
   (...)
   4769     dtype: float64
   4770     """
-> 4771     return SeriesApply(self, func, convert_dtype, args, kwargs).apply()

File ~/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py:1123, in SeriesApply.apply(self)
   1120     return self.apply_str()
   1122 # self.f is Callable
-> 1123 return self.apply_standard()

File ~/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py:1174, in SeriesApply.apply_standard(self)
   1172     else:
   1173         values = obj.astype(object)._values
-> 1174         mapped = lib.map_infer(
   1175             values,
   1176             f,
   1177             convert=self.convert_dtype,
   1178         )
   1180 if len(mapped) and isinstance(mapped[0], ABCSeries):
   1181     # GH#43986 Need to do list(mapped) in order to get treated as nested
   1182     #  See also GH#25959 regarding EA support
   1183     return obj._constructor_expanddim(list(mapped), index=obj.index)

File ~/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/pandas/_libs/lib.pyx:2924, in pandas._libs.lib.map_infer()

File /usr/src/app/saved-notebooks/scaffolding.py:189, in load_viz_notebook_sensor_inference_data.<locals>.<lambda>(md)
    187 print(f"Loaded expanded_ct with length {len(expanded_ct)} for {tq}")
    188 if len(expanded_ct) > 0:
--> 189     expanded_ct["primary_mode_non_other"] = participant_ct_df.cleaned_section_summary.apply(lambda md: max(md["distance"], key=md["distance"].get))
    190     expanded_ct.primary_mode_non_other.replace({"ON_FOOT": "WALKING"}, inplace=True)
    191     valid_sensed_modes = ["WALKING", "BICYCLING", "IN_VEHICLE", "AIR_OR_HSR", "UNKNOWN"]

TypeError: 'float' object is not subscriptable
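The traceback suggests that at least one `cleaned_section_summary` value is a float (most likely NaN) rather than a dict, so `md["distance"]` fails. A minimal sketch of a guarded replacement for the lambda at scaffolding.py:189 is below. The function name and the choice of `"UNKNOWN"` as the fallback are assumptions, not the project's actual fix.

```python
def primary_sensed_mode(section_summary, fallback="UNKNOWN"):
    """Return the sensed mode with the largest distance in a section summary.

    Returns `fallback` (an assumed default) when the summary is not a dict,
    e.g. a NaN float for trips without a summary, or when it has no distances.
    """
    if not isinstance(section_summary, dict):
        return fallback
    distances = section_summary.get("distance")
    if not isinstance(distances, dict) or not distances:
        return fallback
    # Same selection as the original lambda: key with the maximum distance.
    return max(distances, key=distances.get)
```

It could then be passed to `participant_ct_df.cleaned_section_summary.apply(primary_sensed_mode)` in place of the lambda. Whether such trips should be labeled or dropped is a separate decision.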

Execution of generic_timeseries

Executed the following:

```
(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_timeseries.ipynb default
```

Result:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T02:13:18.254876+00:00 with args Namespace(plot_notebook='generic_timeseries.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T02:13:18.290145+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:13:37.510878+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:13:42.655497+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:13:47.867217+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:13:53.096391+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:13:59.360500+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:14:04.712781+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:14:11.055027+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:14:17.425826+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]

Executed Mode Specific Metrics

Executed the following command:

```
(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py mode_specific_metrics.ipynb default
```

Results:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T02:22:55.108546+00:00 with args Namespace(plot_notebook='mode_specific_metrics.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T02:22:55.149004+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:23:11.085428+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:23:16.746364+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:23:22.396607+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:23:29.270322+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]

Execution for mode_specific_timeseries notebook [Error]

Executed the following code:

(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py mode_specific_timeseries.ipynb default

Result:

(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py mode_specific_timeseries.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T02:29:17.211150+00:00 with args Namespace(plot_notebook='mode_specific_timeseries.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T02:29:17.241044+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:29:55.157467+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Traceback (most recent call last):
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 110, in <module>
    compute_for_date(month_year.month, month_year.year)
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 103, in compute_for_date
    nbclient.execute(new_nb)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1305, in execute
    return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/jupyter_core/utils/__init__.py", line 166, in wrapped
    return loop.run_until_complete(inner)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 705, in async_execute
    await self.async_execute_cell(
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1058, in async_execute_cell
    await self._check_raise_for_error(cell, cell_index, exec_reply)
  File "/root/miniconda-23.1.0/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 914, in _check_raise_for_error
    raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
quality_text = scaffolding.get_quality_text(expanded_ct, mode_counts_interest, mode_of_interest, include_test_users)
------------------


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[9], line 1
----> 1 quality_text = scaffolding.get_quality_text(expanded_ct, mode_counts_interest, mode_of_interest, include_test_users)

NameError: name 'mode_counts_interest' is not defined

Executed energy_calculations notebook

Executed the following:

```
(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py energy_calculations.ipynb default
```

Result:

(emission) root@61584fb3620f:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py energy_calculations.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T02:32:10.257561+00:00 with args Namespace(plot_notebook='energy_calculations.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T02:32:10.298417+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:32:24.569960+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T02:32:28.374697+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]

@shankari
Contributor Author

shankari commented Sep 28, 2023

wrt

Executed for generic_metrics_sensed notebook: [Gives Error]

this seems to be a dup of #93 (comment)

If you can check the database and verify that you see the same NaN pattern in the trips from the Vail dataset, I am comfortable ignoring those errors.

As you can see in the above chart, the latest pie_chart depicting "Moped" mode is not being updated. The timestamp doesn't match with the one from the log.

That is kind of bizarre. I don't see any error while running this

Dynamic labels are not available.
Running at 2023-09-28T01:19:53.638888+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T01:19:54.449342+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:20:42.475587+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]

When the notebook is run with nbclient.execute, I believe it stores the outputs in the notebook.

  • You could try changing generate_plots to only run for the aggregate case (year=None, month=None) and then open the resulting notebook to see what is going on
  • you could also see if the errors are for the aggregate data only or also for individual months (and then compare the notebooks if only one of them works)

Once you have figured it out, of course, change generate_plots back and re-test.
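If the executed notebook is written to disk (for example with `nbformat.write`; whether generate_plots.py does this is not shown in this thread), its stored outputs can be scanned for errors without opening Jupyter. A minimal sketch, assuming the standard nbformat v4 JSON layout; both function names are hypothetical:

```python
import json


def load_notebook(path):
    """Load an .ipynb file as a plain dict (notebooks are JSON documents)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_error_cells(notebook):
    """Return (cell_index, ename, evalue) for every stored error output."""
    errors = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # only code cells carry outputs
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                errors.append((index, output.get("ename"), output.get("evalue")))
    return errors
```

Running this on the aggregate notebook and on one of the monthly notebooks would make it easier to compare which cells fail in each case.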

@iantei
Contributor

iantei commented Sep 28, 2023

The chart not being updated is not an issue. I tried to refresh the page prior to the successful completion of the notebook script. Upon the completion of notebook script, things are working fine.

@iantei
Contributor

iantei commented Sep 28, 2023

> When the notebook is run with nbclient.execute, I believe it stores the outputs in the notebook.
>
>   • You could try changing generate_plots to only run for the aggregate case (year=None, month=None) and then open the resulting notebook to see what is going on
>   • you could also see if the errors are for the aggregate data only or also for individual months (and then compare the notebooks if only one of them works)
>
> Once you have figured it out, of course, change generate_plots back and re-test.

The notebook execution is not failing, so I don't think it needs to be re-executed. I will skip it and proceed with the generic_metrics_sensed issue.
I observed the following:

Steps followed:
Added a debug statement in scaffolding.py, inside load_viz_notebook_data, to look for the NaN pattern.

Hello: participant_ct_df.cleaned_section_summary.tail()57402    {'distance': {'ON_FOOT': 886.4937093667857}, '...
57403    {'distance': {'ON_FOOT': 610.2234223038181}, '...
57404    {'distance': {'ON_FOOT': 405.97685486691756}, ...
57405    {'distance': {'IN_VEHICLE': 4230.990793080315,...
57406    {'distance': {'IN_VEHICLE': 4255.784960891164,...
Name: cleaned_section_summary, dtype: object

This seems different.

@iantei
Contributor

iantei commented Sep 28, 2023

Testing done so far:
Scenario I: moped - with default mapping
Dataset used: vail_2022-05-09.tar.gz
STUDY_CONFIG=dev-emulator-program

Note: I am using Safari to load the localhost and see changes.

Execution of generic_metrics [Some issues]

  1. Made the following changes
ashrest2-35384s:em-public-dashboard ashrest2$ git diff docker-compose.dev.yml
diff --git a/docker-compose.dev.yml b/docker-compose.dev.yml
index fdbdbb8..c2a4342 100644
--- a/docker-compose.dev.yml
+++ b/docker-compose.dev.yml
@@ -26,7 +26,7 @@ services:
       - DB_HOST=db
       - WEB_SERVER_HOST=0.0.0.0
       - CRON_MODE=
-      - STUDY_CONFIG=stage-program
+      - STUDY_CONFIG=dev-emulator-program
     ports:
       # ipynb in numbers
       - "47962:8888"

Re-launched the docker-compose.dev.yml up
Executed the following:

(emission) root@183bf369df9a:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics.ipynb default

Results:

/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/dev-emulator-program.nrel-op.json
Successfully downloaded config with version 1 for Development environment (program) and data collection URL default
Dynamic labels download was successful.
Running at 2023-09-28T03:19:26.899624+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T03:19:26.938714+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={'MODE': [{'value': 'walk', 'baseMode': 'WALKING', 'met_equivalent': 'WALKING', 'kgCo2PerKm': 0}, {'value': 'e-bike', 'baseMode': 'E_BIKE', 'met': {'ALL': {'range': [0, -1], 'mets': 4.9}}, 'kgCo2PerKm': 0.00728}, {'value': 'bike', 'baseMode': 'BICYCLING', 'met_equivalent': 'BICYCLING', 'kgCo2PerKm': 0}, {'value': 'bikeshare', 'baseMode': 'BICYCLING', 'met_equivalent': 'BICYCLING', 'kgCo2PerKm': 0}, {'value': 'scootershare', 'baseMode': 'E_SCOOTER', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.00894}, {'value': 'drove_alone', 'baseMode': 'CAR', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.22031}, {'value': 'shared_ride', 'baseMode': 'CAR', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.11015}, {'value': 'e_car_drove_alone', 'baseMode': 'E_CAR', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.08216}, {'value': 'e_car_shared_ride', 'baseMode': 'E_CAR', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.04108}, {'value': 'moped', 'baseMode': 'MOPED', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.05555}, {'value': 'taxi', 'baseMode': 'TAXI', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.30741}, {'value': 'bus', 'baseMode': 'BUS', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.20727}, {'value': 'train', 'baseMode': 'TRAIN', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.12256}, {'value': 'free_shuttle', 'baseMode': 'BUS', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.20727}, {'value': 'air', 'baseMode': 'AIR', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.09975}, {'value': 'not_a_trip', 'baseMode': 'UNKNOWN', 'met_equivalent': 'UNKNOWN', 'kgCo2PerKm': 0}, {'value': 'other', 'baseMode': 'OTHER', 'met_equivalent': 'UNKNOWN', 'kgCo2PerKm': 0}], 'PURPOSE': [{'value': 'home'}, 
{'value': 'work'}, {'value': 'at_work'}, {'value': 'school'}, {'value': 'transit_transfer'}, {'value': 'shopping'}, {'value': 'meal'}, {'value': 'pick_drop_person'}, {'value': 'pick_drop_item'}, {'value': 'personal_med'}, {'value': 'access_recreation'}, {'value': 'exercise'}, {'value': 'entertainment'}, {'value': 'religious'}, {'value': 'other'}], 'REPLACED_MODE': [{'value': 'no_travel'}, {'value': 'walk'}, {'value': 'bike'}, {'value': 'bikeshare'}, {'value': 'scootershare'}, {'value': 'drove_alone'}, {'value': 'shared_ride'}, {'value': 'e_car_drove_alone'}, {'value': 'e_car_shared_ride'}, {'value': 'taxi'}, {'value': 'bus'}, {'value': 'train'}, {'value': 'free_shuttle'}, {'value': 'other'}], 'translations': {'en': {'walk': 'Walk', 'e-bike': 'E-bike', 'bike': 'Regular Bike', 'bikeshare': 'Bikeshare', 'scootershare': 'Scooter share', 'drove_alone': 'Gas Car Drove Alone', 'shared_ride': 'Gas Car Shared Ride', 'e_car_drove_alone': 'E-Car Drove Alone', 'e_car_shared_ride': 'E-Car Shared Ride', 'moped': 'Moped', 'taxi': 'Taxi/Uber/Lyft', 'bus': 'Bus', 'train': 'Train', 'free_shuttle': 'Free Shuttle', 'air': 'Air', 'not_a_trip': 'Not a trip', 'no_travel': 'No travel', 'home': 'Home', 'work': 'To Work', 'at_work': 'At Work', 'school': 'School', 'transit_transfer': 'Transit transfer', 'shopping': 'Shopping', 'meal': 'Meal', 'pick_drop_person': 'Pick-up/ Drop off Person', 'pick_drop_item': 'Pick-up/ Drop off Item', 'personal_med': 'Personal/ Medical', 'access_recreation': 'Access Recreation', 'exercise': 'Recreation/ Exercise', 'entertainment': 'Entertainment/ Social', 'religious': 'Religious', 'other': 'Other'}, 'es': {'walk': 'Caminando', 'e-bike': 'e-bicicleta', 'bike': 'Bicicleta', 'bikeshare': 'Bicicleta compartida', 'scootershare': 'Motoneta compartida', 'drove_alone': 'Coche de Gas, Condujo solo', 'shared_ride': 'Coche de Gas, Condujo con otros', 'e_car_drove_alone': 'e-coche, Condujo solo', 'e_car_shared_ride': 'e-coche, Condujo con ontras', 'moped': 'Ciclomotor', 
'taxi': 'Taxi/Uber/Lyft', 'bus': 'Autobús', 'train': 'Tren', 'free_shuttle': 'Colectivo gratuito', 'air': 'Avión', 'not_a_trip': 'No es un viaje', 'no_travel': 'No viajar', 'home': 'Inicio', 'work': 'Trabajo', 'at_work': 'En el trabajo', 'school': 'Escuela', 'transit_transfer': 'Transbordo', 'shopping': 'Compras', 'meal': 'Comida', 'pick_drop_person': 'Recoger/ Entregar Individuo', 'pick_drop_item': 'Recoger/ Entregar Objeto', 'personal_med': 'Personal/ Médico', 'access_recreation': 'Acceder a Recreación', 'exercise': 'Recreación/ Ejercicio', 'entertainment': 'Entretenimiento/ Social', 'religious': 'Religioso', 'other': 'Otros'}}})]

Waited for the completion of the notebook script execution.

There was an already open localhost webpage, reloaded it. (left)
Opened the localhost webpage from public-dashboard page. (right)
Screenshot 2023-09-27 at 8 31 20 PM

Re-checked the database:

> db.Stage_analysis_timeseries.find({"data.user_input.mode_confirm":"walk"}).count()
0
> db.Stage_analysis_timeseries.find({"data.user_input.mode_confirm":"moped"}).count()
4359

I will try to execute the steps mentioned here to see if this is causing it.

That is kind of bizarre. I don't see any error while running this

Dynamic labels are not available.
Running at 2023-09-28T01:19:53.638888+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T01:19:54.449342+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T01:20:42.475587+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]

When the notebook is run with nbclient.execute, I believe it stores the outputs in the notebook.

  • You could try changing generate_plots to only run for the aggregate case (year=None, month=None) and then open the resulting notebook to see what is going on
  • you could also see if the errors are for the aggregate data only or also for individual months (and then compare the notebooks if only one of them works)

Once you have figured it out, of course, change generate_plots back and re-test.

Re-executing the above steps in Mozilla to see if it's a browser-specific issue.
Tested the above scenario again, this time using STUDY_CONFIG=stage-program, and re-executed generic_metrics.
Still the same issue: upon refreshing the page, the charts are not reloaded. But when I go to the public-dashboard page and launch localhost from there, the new changes are reflected.

@shankari
Contributor Author

shankari commented Sep 28, 2023

There was an already open localhost webpage, reloaded it. (left)
Opened the localhost webpage from public-dashboard page. (right)

@iantei So the issue that you are highlighting is that reloading does not actually reload the metrics, but opening a new page does?

Couple of high level comments:

  1. it frequently happens that reloads take some time to actually refresh. I would do the following:
    1. verify using timestamps and put the verification results into the issue (i.e. compare file timestamps with script output timestamps)
    2. open the developer tools, go to the network tab and refresh. Make sure that the response status is 200 instead of one of the cached status codes
  2. note that on the right, the "trip miles per mode" has not been updated even in the newly opened browser. Please make sure to verify multiple metrics instead of focusing only on "Number of trips"
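The timestamp comparison in 1.1 could be scripted roughly as follows. This is only a sketch: the helper name `stale_plots` and the assumption that the generated charts are image files in a single directory are illustrative and are not taken from the dashboard code.

```python
from datetime import datetime, timezone
from pathlib import Path


def stale_plots(plot_dir, run_started_at, pattern="*.png"):
    """Return names of files in plot_dir last modified before run_started_at.

    run_started_at is a timezone-aware datetime, e.g. parsed from the
    "Running at ..." line printed by the plot generation script.
    """
    stale = []
    for path in sorted(Path(plot_dir).glob(pattern)):
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if modified < run_started_at:
            stale.append(path.name)
    return stale
```

Calling it with `datetime.fromisoformat("2023-09-28T03:56:45.000945+00:00")` and the plots directory would list any chart that was not rewritten by that run, which separates "the metric was not recomputed" from "the browser is showing a cached copy".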

@iantei
Contributor

iantei commented Sep 28, 2023

  1. i. Timestamp in this case:
image
Last updated 2023-09-28T03:19:43 (from the chart in browser)
Last updated 2023-09-28T03:56:45 (from the terminal log)
Last successful execution of generic_metrics in terminal log:

(emission) root@24be9fa35678:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/update_mappings.py mapping_dictionaries.ipynb 
(emission) root@24be9fa35678:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T03:56:45.000945+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T03:56:45.042602+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:09.580454+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:16.480758+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:23.666826+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:30.957319+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]

ii. The response status is 200, and 404 for the sensed ones.
image

  2. I will make a note of all metrics going forward.

@shankari
Contributor Author

shankari commented Sep 28, 2023

I will make a note of all metrics going forward.

@iantei not just going forward. There is no evidence here that the miles graph is updated, even with the new browser tab (right image)

I would like to see a record of the miles_mode_confirm (and the other metrics) images with recent timestamps as well

@iantei
Contributor

iantei commented Sep 28, 2023

image image image

@iantei
Contributor

iantei commented Sep 28, 2023

Here's the command I ran along with the output.

(emission) root@24be9fa35678:/usr/src/app/saved-notebooks# PYTHONPATH=..python bin/update_mappings.py mapping_dictionaries.ipynb 
bash: bin/update_mappings.py: Permission denied
(emission) root@24be9fa35678:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/update_mappings.py mapping_dictionaries.ipynb 
(emission) root@24be9fa35678:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
Dynamic labels are not available.
Running at 2023-09-28T03:56:45.000945+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2020-09-01T00:00:00+00:00]>, <Arrow [2023-09-01T00:00:00+00:00]>)
Running at 2023-09-28T03:56:45.042602+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:09.580454+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:16.480758+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:23.666826+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:30.957319+00:00 with params [Parameter('year', int, value=2020), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:37.393099+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:46.062188+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:57:53.451332+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:58:00.888240+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:58:07.778203+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:58:14.674005+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:58:21.531039+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:58:29.221692+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:58:36.694908+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:58:43.982052+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:58:51.249523+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:58:58.671300+00:00 with params [Parameter('year', int, value=2021), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:59:06.116057+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:59:13.464401+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:59:20.682934+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:59:27.967198+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:59:35.532452+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:59:41.757946+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:59:49.181585+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T03:59:56.448044+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:00:03.772014+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:00:11.061260+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:00:18.367641+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:00:25.616975+00:00 with params [Parameter('year', int, value=2022), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:00:33.079061+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:00:40.411851+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:00:47.713259+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:00:55.111689+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:01:02.415842+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:01:09.826848+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:01:17.238615+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:01:24.568172+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
Running at 2023-09-28T04:01:31.888439+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=True), Parameter('dynamic_labels', dict, value={})]
(emission) root@24be9fa35678:/usr/src/app/saved-notebooks#

@shankari
Contributor Author

@iantei Testing of the non-sensed notebooks seems to be fine. However, I do not see the reason for the notebook with the sensed metrics to fail. Once we have established that reason, I am happy to merge.

@iantei
Contributor

iantei commented Sep 28, 2023

Proceeding with the generic_metrics_sensed issue.
I got the following observation:

Steps followed:
Added a debug statement in scaffolding.py, inside load_viz_notebook_data, to look for the NaN pattern.

Hello: participant_ct_df.cleaned_section_summary.tail()57402    {'distance': {'ON_FOOT': 886.4937093667857}, '...

Result:

57403    {'distance': {'ON_FOOT': 610.2234223038181}, '...
57404    {'distance': {'ON_FOOT': 405.97685486691756}, ...
57405    {'distance': {'IN_VEHICLE': 4230.990793080315,...
57406    {'distance': {'IN_VEHICLE': 4255.784960891164,...
Name: cleaned_section_summary, dtype: object

This seems different.

@iantei
Contributor

iantei commented Sep 28, 2023

There are three notable issues with this change right now:

  1. Related to the page refresh
  2. Related to generic_metrics_sensed
  3. Related to mode_specific_timeseries

@shankari
Contributor Author

Added a debug statement in scaffolding.py, inside load_viz_notebook_data, to look for the NaN pattern.

Where is the patch showing the statement you added? This does not look like python code to me

Hello: participant_ct_df.cleaned_section_summary.tail()57402    {'distance': {'ON_FOOT': 886.4937093667857}, '...

Also, it is not clear that tail() is the right option. The NaN can occur anywhere. We are doing .apply() on the full dataframe. You need to use something like isnan over the dataframe.
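A sketch of the kind of check meant here, written in plain Python so that it runs outside the dashboard environment. The helper name `nan_positions` is hypothetical; on the real dataframe the equivalent would presumably be `participant_ct_df.cleaned_section_summary.isna()`, which flags missing entries over the whole column rather than only the last five rows.

```python
import math


def nan_positions(values):
    """Return the positions of entries in values that are NaN or None."""
    missing = []
    for pos, value in enumerate(values):
        if value is None or (isinstance(value, float) and math.isnan(value)):
            missing.append(pos)
    return missing


# Toy stand-in for the cleaned_section_summary column: the NaN sits in the
# middle, so looking only at the last rows would not reveal it.
column = [
    {"distance": {"ON_FOOT": 886.49}},
    float("nan"),
    {"distance": {"IN_VEHICLE": 4230.99}},
]
print(nan_positions(column))  # [1]
```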

@shankari
Contributor Author

wrt #89 (comment)

  1. Related to the page refresh: seems to be resolved with #89 (comment) and #89 (comment). The metrics are being recomputed properly; we can't fix browser caching.
  2. Related to generic_metrics_sensed: this still needs investigation
  3. Related to mode_specific_timeseries: I had missed this! It also still needs investigation.

@shankari
Contributor Author

Most likely, due to the computation difference of energies with different mapping, this different illustration is observed.

Obviously, the difference is due to the different mappings. But what is the difference and why does it cause the values to be different? We have to understand it and make sure that it is not an error. When we had the "other" trips completely dropped from (3), it was also due to a difference in the mapping, but it was an error.

most of the values here are the same - the only difference is in "Gas Car, with others" and "No Travel". Please investigate the cause of the difference between them.

@shankari
Contributor Author

@iantei for #89 (comment), which dynamic config did you use, and how did you use it?

@iantei
Contributor

iantei commented Sep 29, 2023

The dynamic config mentioned on #91 (comment)
Attached the url for the used dynamic config:
example-study-label-options

@shankari
Contributor Author

shankari commented Sep 29, 2023

@iantei

  1. No "No Travel": So there's a pretty obvious issue with using that - the example-study-label-options is intended for use with a study, so it doesn't have the replaced mode. Note that in mode_labels.csv, there is only one set of mode key -> mode value mappings. But in the dynamic config labels, there are separate mappings for mode and replaced mode because we show them to users, and there is no point in showing "No Travel" as a mode to a user, for example. You should use the replaced mode mapping to map the replaced mode values, and use the program label options for testing.
  2. Carpool is zero: you are continuing to use the energy_intensity.csv as the mapping. That uses the user-visible value as the key. There is a row for "Gas Car, with others" but not one for "Gas Car Shared Ride", which is why it ends up with zero. But that is an error. Although the modes in the dynamic config are currently almost the same as the standard modes, there is no guarantee that they will be the same or even remotely similar. That's why the dynamic config has the kgCo2PerKm value encoded as part of the config. For the dynamic config, you need to use those values instead of the values from energy_intensity.csv.
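To make point 2 concrete, here is a minimal sketch of looking up the emission factor from the dynamic config rather than from energy_intensity.csv. The `MODE`, `value` and `kgCo2PerKm` field names follow the dynamic_labels parameter printed in the log earlier in this thread; the function names and the choice to use 0 for a label without a factor are assumptions for illustration, not the dashboard's actual code.

```python
def co2_factors(dynamic_labels):
    """Map each mode label value to its kgCo2PerKm from the dynamic config."""
    return {
        entry["value"]: entry["kgCo2PerKm"]
        for entry in dynamic_labels.get("MODE", [])
        if "kgCo2PerKm" in entry
    }


def trip_co2_kg(mode_value, distance_km, factors, default=0.0):
    """CO2 in kg for one trip; labels without a factor use default (an assumption)."""
    return factors.get(mode_value, default) * distance_km


# Two entries trimmed from the dev-emulator-program labels shown in the log.
labels = {
    "MODE": [
        {"value": "drove_alone", "baseMode": "CAR", "kgCo2PerKm": 0.22031},
        {"value": "shared_ride", "baseMode": "CAR", "kgCo2PerKm": 0.11015},
    ]
}
factors = co2_factors(labels)
print(trip_co2_kg("shared_ride", 10, factors))
```

Because the lookup is keyed by the internal label value (`shared_ride`) and not by the display text, it does not matter whether the text shown to the user is "Gas Car, with others" or "Gas Car Shared Ride".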

This was highlighted while creating the issue:

However, this change has not yet been implemented in the public dashboard. The public dashboard still reads the user-specific mappings and the CO2 and energy equivalents from files hardcoded in this repo (notably in viz_scripts/auxiliary_files).

The fact that the current examples are similar to the default values may be confusing you. You have to understand the underlying concepts to make sure that you build a solution that works properly.

As a concrete example, the CA e-bike rebate program is planning to support the following modes. Keep that in mind while designing and testing your solution.

Walk
Auto Driver
Auto Passenger
E-bike
Regular bike
Taxi/Uber/Lyft
Bike/Scooter-share
Bus/Train/Shuttle
Other
ERROR: Not a real trip

@iantei
Contributor

iantei commented Oct 3, 2023

In the current implementation, I am using translations.en key values create a mapping with mode_confirm and replaced_mode to form Mode_confirm and Replaced_mode respectively.

The reason there is no "No travel" is that the translations.en block differs between example-study-label-options and example-program-label-options. example-study-label-options.json does not have "no_travel": "No travel". Therefore, when I chose to test with example-study-label-options, I wasn't able to see "No travel".
Choosing the dynamic_config from example-program-label-options shows "No travel".
image
Fig: Energy_Impact chart vs Replaced mode chart with example-program-label-options

Code to showcase the mapping

```
from collections import defaultdict

# dynamic_labels, dic_re, expanded_ct and study_type are defined earlier in the notebook

# Extract the English translations from the dynamic labels, if present
dic_translations = dict()

if "translations" in dynamic_labels and "en" in dynamic_labels["translations"]:
    dic_translations = dynamic_labels["translations"]["en"]
    # Labels without a translation fall back to 'Other'
    dic_translations = defaultdict(lambda: 'Other', dic_translations)

# Select the mapping based on availability of dynamic_labels
if dic_translations:
    dic_mapping = dic_translations
else:
    dic_mapping = dic_re

# Map new mode labels with translations dictionary from dynamic_labels
# CASE 2 of https://github.com/e-mission/em-public-dashboard/issues/69#issuecomment-1256835867
if "mode_confirm" in expanded_ct.columns:
    expanded_ct['Mode_confirm'] = expanded_ct['mode_confirm'].map(dic_mapping)
if study_type == 'program':
    # CASE 2 of https://github.com/e-mission/em-public-dashboard/issues/69#issuecomment-1256835867
    if 'replaced_mode' in expanded_ct.columns:
        expanded_ct['Replaced_mode'] = expanded_ct['replaced_mode'].map(dic_mapping)
    else:
        print("This is a program, but no replaced modes found. Likely cold start case. Ignoring replaced mode mapping")
else:
    print("This is a study, not expecting any replaced modes.")
```

@iantei
Contributor

iantei commented Oct 3, 2023

  1. Carpool is zero: you are continuing to use the energy_intensity.csv as the mapping. That uses the user visible value as the key. There is a row for "Gas Car, with others" but not one for "Gas Car Shared Ride", which is why it ends up with zero. But that is an error. Although the modes in the dynamic config are currently very similar to the standard modes, there is no guarantee that they will be the same or even remotely similar. That's why the dynamic config has the kgCO2PerKm value encoded as part of the config. For the dynamic config, you need to use those values instead of the values from energy_intensity.csv.

Yes, I am still using energy_intensity.csv as the mapping which needs to be fixed.
However, the above figure shows the Energy_Impact(kWH) chart rather than CO2 emissions. There is a kgCO2PerKm value provided in the dynamic_config, but there is no mode-to-fuel mapping, which is what dic_fuel provides for the calculations behind the Energy_Impact(kWH) chart. Should we associate a fuel type with each mode in the dynamic_config too?
Moreover, I only see the kgCO2PerKm value encoded for MODE, and not for REPLACED_MODE, in the example-program-label-options dynamic_config.

@shankari
Contributor Author

shankari commented Oct 3, 2023

At a high level, you are correct that energy is more complicated since we don't compute it on the phone (currently) and so it is not part of the dynamic config. As a first step, I would suggest two things to get this change done before we expand to energy, etc.:

  1. if we have a dynamic label, use the kgCO2PerKm directly instead of energy x unit emissions for the computation
  2. we will not have correct values for the energy computations in the public dashboard then, for dynamic labels. Fortunately, what we display by default is the emissions impact. So we should change the frontend (for the public dashboard) to skip the energy computation metrics from the dropdown if it is a dynamic label.
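
The first step could look something like the sketch below. It assumes each entry in the dynamic config's MODE list carries a `value` key and a `kgCO2PerKm` key (the field name is taken from this thread, so the exact spelling in the config file should be checked). The helper names and the sample numbers are made up for illustration.

```python
# Hypothetical sketch: emissions from the per-mode kgCO2PerKm in the dynamic labels.
# The entry layout ("value", "kgCO2PerKm") and all numbers are illustrative assumptions.

def build_footprint_map(dynamic_labels):
    """Map each MODE value to its kgCO2PerKm, skipping entries without one."""
    return {
        entry["value"]: entry["kgCO2PerKm"]
        for entry in dynamic_labels.get("MODE", [])
        if "kgCO2PerKm" in entry
    }


def trip_kg_co2(mode, distance_km, footprint_map):
    """Emissions for one trip in kg; modes without a footprint return None."""
    per_km = footprint_map.get(mode)
    if per_km is None:
        return None
    return per_km * distance_km


sample_labels = {
    "MODE": [
        {"value": "walk", "kgCO2PerKm": 0},
        {"value": "shared_ride", "kgCO2PerKm": 0.1},
    ]
}
footprints = build_footprint_map(sample_labels)
print(trip_kg_co2("shared_ride", 10, footprints))
```

Returning None for an unknown mode, rather than zero, keeps a missing footprint visible instead of silently reporting no emissions.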

What if the dynamic labels are incorrect?

  1. you shouldn't use the example dynamic labels as the dynamic labels. Every partner can choose to use a different set of dynamic labels. That's what the dynamic label feature enables. So you can use them as an example, but you should identify the higher level problem, which is the one above.
  2. In that case, it is fine if the output is incorrect. GIGO. But if the dynamic labels are correct then the graphs must be correct.

For context on how we will/can expand this going forward:

@iantei
Contributor

iantei commented Oct 6, 2023

We use some values from energy_intensity.csv to compute the CO2 emission. In brief, for each of mode_replaced and confirm_mode we compute energy_intensity_factor * CO2_factor, which gives the CO2 emission per unit distance (likely per km or per mile). Multiplying that by the distance covered gives the total CO2 emissions.
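
A minimal sketch of that default computation follows. The rows and values are placeholders rather than real contents of energy_intensity.csv, the units of the two factors are assumed to be compatible, and the zero fallback for a missing mode is an assumption that mirrors the zero seen for "Gas Car Shared Ride" earlier in the thread.

```python
# Hypothetical sketch of the default computation; values are placeholders,
# not real rows of energy_intensity.csv.
energy_intensity_rows = [
    {"mode": "Gas Car, with others", "energy_intensity_factor": 4.0, "CO2_factor": 0.25},
    {"mode": "Walk", "energy_intensity_factor": 0.0, "CO2_factor": 0.0},
]

# CO2 per unit distance for each mode: energy per distance x CO2 per unit energy
co2_per_distance = {
    row["mode"]: row["energy_intensity_factor"] * row["CO2_factor"]
    for row in energy_intensity_rows
}


def trip_co2(mode, distance):
    """Total CO2 for a trip; modes missing from the table contribute zero (assumed)."""
    return co2_per_distance.get(mode, 0.0) * distance


print(trip_co2("Gas Car, with others", 10))  # 10.0
print(trip_co2("Gas Car Shared Ride", 10))   # 0.0, because the key is not in the table
```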

But when we have a dynamic label, we would like to use the kgCO2PerKm value for each commute mode instead of calculating as described above. The kgCO2PerKm is provided in the dynamic label.

When using this kgCO2PerKm value, the CO2 emissions came out slightly different from those of the default mapping for the same Replaced mode in "Sketch of CO2 emissions impact by trips of specified mode [e-bike]".
On further investigation, the Mode_confirm_kg_CO2 and Replaced_mode_kg_CO2 values computed for different modes of commute (Bus, Free Shuttle, E-Bike and others) differ slightly between the default mapping and the dynamic config.
This per-km difference, multiplied by the distance (in km), produces the different values in the charts shown below.

Default_Mapping Dynamic_Config

@iantei
Contributor

iantei commented Oct 6, 2023

In the generic Timeseries notebook, the emission charts are displayed in lb and miles. Should we align these metrics so they are displayed in kg and km? We compute in kg and km when dynamic labels are available.
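
For reference, a small sketch of the conversions involved. The constants are the exact definitions of the pound and the mile, while the function names are hypothetical and not taken from the notebooks.

```python
# Exact definitions: 1 lb = 0.45359237 kg, 1 mile = 1.609344 km
KG_PER_LB = 0.45359237
KM_PER_MILE = 1.609344


def lb_to_kg(lb):
    """Convert a mass in pounds to kilograms."""
    return lb * KG_PER_LB


def miles_to_km(miles):
    """Convert a distance in miles to kilometres."""
    return miles * KM_PER_MILE


def lb_per_mile_to_kg_per_km(lb_per_mile):
    """Convert an emission factor from lb CO2 per mile to kg CO2 per km."""
    return lb_per_mile * KG_PER_LB / KM_PER_MILE


print(miles_to_km(10))
print(lb_per_mile_to_kg_per_km(1.0))
```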

@iantei
Contributor

iantei commented Oct 6, 2023

We're mapping mode_confirm to Mode_confirm using the dynamic label's translations.en mapping.
When I print out the unique values present in mode_confirm using the code below, I see the following:

expanded_ct.mode_confirm = expanded_ct.mode_confirm.astype(str)
unique_values = np.unique(expanded_ct.mode_confirm)
print("print all unique values of mode_confirm")
print(list(unique_values))
print all unique values of mode_confirm
['air', 'airport_shuttle', 'bike', 'bike_& walk', 'bikeshare', 'bus', 'car', 'car-fly', 'cart', 'combination_football game, dinner, drove friend home', 'dropped_my truck at the repair shop and biked to work.', 'drove_alone', 'e-bike', 'electric_car', 'electric_vehicle', 'fly-drive-bike-drive-fly', 'free_shuttle', 'gondola', 'houseboat', 'jog', 'kayak', 'motorboat', 'multi-modal', 'multi-modal_car- regular bike- car', 'multi-modal_drive to bike to drive', 'multi_modal', 'multiple_modes', 'nan', 'not_a_trip', 'pilot_ebike', 'pontoon_boat', 'ride', 'run', 'running', 'sailboat', 'scooter_(non-motorized, non-share)', 'scootershare', 'shared_ride', 'sharedride_and walk', 'sharedrideandwalk', 'skate', 'ski', 'ski/car_drove alone', 'skiing', 'snowboarding', 'subway', 'taxi', 'time_at work', 'tow_truck', 'train', 'tram', 'tramway', 'u-haul_truck', 'walk', 'walk_& bike', 'walk_+ drove alone', 'work']

However, I see only the following values listed under MODE in the example dynamic label:

walk, e-bike, bike, bikeshare, scootershare, drove_alone, shared_ride, e_car_drove_alone, e_car_shared_ride, moped, taxi, bus, train, free_shuttle, air, not_a_trip, other.

There are a few notable values, such as work, among the mode_confirm values. This would be mapped into Mode_confirm as "To Work".

Should not work be categorized under purpose_confirm over mode_confirm?

@shankari
Contributor Author

shankari commented Oct 6, 2023

Should not work be categorized under purpose_confirm over mode_confirm?

Good catch. Yes, work should be purpose_confirm - clearly some users have made errors in labeling. How do you propose to fix it?

@iantei
Contributor

iantei commented Oct 6, 2023

We're using the mode_confirm, replaced_mode, purpose_confirm to map into Mode_confirm, Replaced_mode and Trip_purpose respectively using dic_re and dic_pur.
In the case, we have work categorized as mode_confirm, it would get Mode_confirm as Other because of the existing dic_re = defaultdict(lambda: 'Other',dic_re) mapping. So, the error labels like work would be categorized under Other Mode_confirm.
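
To illustrate the fallback, here is a minimal sketch with a made-up subset standing in for dic_re (the real dictionary is built elsewhere in the notebooks, and its contents are not shown in this thread):

```python
from collections import defaultdict

# Illustrative stand-in for dic_re; the real mapping has many more entries
dic_re = {"walk": "Walk", "shared_ride": "Gas Car, with others"}
dic_re = defaultdict(lambda: "Other", dic_re)

raw_labels = ["walk", "work", "kayak", "shared_ride"]
mapped = [dic_re[label] for label in raw_labels]
print(mapped)  # ['Walk', 'Other', 'Other', 'Gas Car, with others']
```

pandas `Series.map` honours the `__missing__` behaviour of a dict subclass such as defaultdict, so the same fallback applies when the mapping is passed to `map`.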

We're retrieving the user labeling from user_label - from analysis/confirmed_trip. Do we want to rectify this improper labeling altogether?

For the future gatekeeping, we can provide the user with a list of option to select from [like dropdown menu]. But if we're providing custom label options, it'd be difficult to check whether the purpose_confirm is a correct label or not.

Likewise for the dynamic_config, re-designing the mapping between different labels by not using mode_confirm and replaced_mode. Rather, map from MODE to translations.en's key to value for mode_Confirm and accordingly for PURPOSE and REPLACE_MODE too is an option. But this will only be error-proof accounting the dynamic labels are properly updated. So, this re-design might work for the present example dynamic label file, but might not be a proper solution.

@shankari
Contributor Author

shankari commented Oct 6, 2023

In the case, we have work categorized as mode_confirm, it would get Mode_confirm as Other because of the existing dic_re = defaultdict(lambda: 'Other',dic_re) mapping. So, the error labels like work would be categorized under Other Mode_confirm.

Yes, in the default mapping. That is the desired behavior. But not in the dynamic mapping by default because it is not split between mode and purpose.

We're retrieving the user labeling from user_label - from analysis/confirmed_trip. Do we want to rectify this improper labeling altogether?

I don't understand what you mean by this.

For the future gatekeeping, we can provide the user with a list of option to select from [like dropdown menu]. But if we're providing custom label options, it'd be difficult to check whether the purpose_confirm is a correct label or not.

I don't know what you mean by this - we already do provide the user with a list of options to select from

But this will only be error-proof accounting the dynamic labels are properly updated. So, this re-design might work for the present example dynamic label file, but might not be a proper solution.

I am not sure what you mean by this

@iantei
Contributor

iantei commented Oct 6, 2023

My understanding is work under mode_confirm is incorrect. It should only come under purpose_confirm, and not inside mode_confirm.

We are fetching all this information related to confirmed trip from analysis/confirmed_trip

def load_all_confirmed_trips(tq):
    agg = esta.TimeSeries.get_aggregate_time_series()
    all_ct = agg.get_data_df("analysis/confirmed_trip", tq)
    print("Loaded all confirmed trips of length %s" % len(all_ct))
    disp.display(all_ct.head())
    return all_ct

Categorizing labels, which are not present on dic_re, but are actual mode_confirm label as Other would be right, but not in this case where the mode_confirm is work, right?

I don't know what you mean by this - we already do provide the user with a list of options to select from

If we already provide the user with a list of options to select from, how did the mislabeling happen?

What happens if there are wrong labels in MODE, PURPOSE, REPLACED_MODE? Is there a way to guarantee there would not be mislabeling in the future?

@shankari
Contributor Author

shankari commented Oct 6, 2023

If we already provide the user with a list of options to select from, how did the mislabeling happen?

Because they selected "other" and typed in whatever they wanted

Categorizing labels, which are not present on dic_re, but are actual mode_confirm label as Other would be right, but not in this case where the mode_confirm is work, right?

Why? We have a list of valid mode labels. They entered something else. That is an "other" label.

What happens if there are wrong labels in MODE, PURPOSE, REPLACED_MODE? Is there a way to guarantee there would not be mislabeling in the future?

No. We cannot and will not guarantee that there is no mislabeling. Any mislabeled trips will just go into OTHER. That is the expected output.

@iantei
Contributor

iantei commented Oct 7, 2023

Good catch. Yes, work should be purpose_confirm - clearly some users have made errors in labeling. How do you propose to fix it?

For the dynamic mapping:
Instead of mapping in this way

    if "translations" in dynamic_labels and "en" in dynamic_labels["translations"]:
        dic_translations = dynamic_labels["translations"]["en"]
        dic_translations = defaultdict(lambda: 'Other', dic_translations)

    # Select the mapping based on availability of dynamic_labels
    if dic_translations:
        dic_mapping = dic_translations
    else:
        dic_mapping = dic_re

I would like to propose the following:
Create a dictionary that maps the dynamic_labels.MODE.value to their corresponding translations in translations.en's value. This way, we would only be using dynamic labels and not rely on existing mode_confirm, replaced_mode and purpose_confirm to create Mode_confirm, Replaced_mode and Purpose_confirm.
Similarly, different mapping would be created for REPLACED_MODE and PURPOSE.

This way, we would not have wrong categories for different mode and purpose.
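
A rough sketch of this proposal, assuming each category in the dynamic labels (MODE, PURPOSE, REPLACED_MODE) is a list of entries with a `value` key and that translations.en is keyed by those values. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def category_mapping(dynamic_labels, category, lang="en"):
    """Map only the values listed under one category to their translations."""
    translations = dynamic_labels.get("translations", {}).get(lang, {})
    mapping = {
        entry["value"]: translations[entry["value"]]
        for entry in dynamic_labels.get(category, [])
        if entry["value"] in translations
    }
    # Anything not listed for this category falls back to 'Other'
    return defaultdict(lambda: "Other", mapping)


sample_labels = {
    "MODE": [{"value": "walk"}, {"value": "bike"}],
    "PURPOSE": [{"value": "work"}],
    "translations": {"en": {"walk": "Walk", "bike": "Regular Bike", "work": "To Work"}},
}

mode_map = category_mapping(sample_labels, "MODE")
purpose_map = category_mapping(sample_labels, "PURPOSE")
print(mode_map["work"])     # Other
print(purpose_map["work"])  # To Work
```

With separate mappings, a purpose such as work that was entered as a mode ends up under Other instead of being translated to "To Work".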

@iantei
Contributor

iantei commented Oct 13, 2023

Testing done on generic_timeseries notebook.

I tested this with default mapping, prior to my changes and observed this behavior.
This still uses the default mapping. I chose the Year as 2022 and Month as 5.

We're showing all the labels associated with Confirmed Mode in the legend, while only 5 distinct modes appear on the timeseries graph.
Moreover, even though the chart title mentions "excluding 'Other' and 'Not a trip'", we still list them in the legend. Would it be a good idea to show only the lines for the modes in the list of confirmed modes?

Screenshot 2023-10-12 at 9 52 55 PM

@iantei
Contributor

iantei commented Oct 24, 2023

I have used kg and km in the emission and energy computations instead of lbs and miles (which were used previously), which changes all the displayed values in the charts belonging to generic_timeseries and energy_calculations.
However, the generic_metrics, mode_specific_metrics and other notebooks still display charts and values in lbs and miles. Should I change these notebooks as well to use kg/km instead of lbs/miles in the different charts?

@iantei iantei moved this from Issues being worked on to Ready for review in OpenPATH Tasks Overview Oct 27, 2023
@shankari
Contributor Author

Moreover, even though the chart title mentions "excluding 'Other' and 'Not a trip'", we still list them in the legend. Would it be a good idea to show only the lines for the modes in the list of confirmed modes?

Sure. But it is not clear why "Not a Trip" and "Other" are showing up in the legend in the first place.
The legend is autogenerated

plt.legend(bbox_to_anchor=(1.02, 1), loc='best', borderaxespad=0, title=legend_title)

so figuring out the reason is likely to need matplotlib skills. It is not wrong (both Not a trip and Other are zero), so I would file a new issue and clean it up later rather than holding up this change further.
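
One possible workaround for that follow-up issue, independent of the root cause, is to drop excluded and all-zero series before plotting, so the autogenerated legend never sees them. This is only a sketch: the helper name and the sample data are hypothetical, and it does not explain why the entries appear today.

```python
# Hypothetical helper: choose which series to plot, and hence which legend entries appear
EXCLUDED_LABELS = {"other", "not a trip"}


def series_to_plot(series_by_label, excluded=EXCLUDED_LABELS):
    """Keep labels that are not excluded and have at least one non-zero value."""
    return {
        label: values
        for label, values in series_by_label.items()
        if label.lower() not in excluded and any(v != 0 for v in values)
    }


sample = {
    "Walk": [1, 0, 2],
    "E-bike": [0, 3, 1],
    "Other": [0, 0, 0],
    "Not a Trip": [0, 0, 0],
    "Taxi": [0, 0, 0],
}
print(list(series_to_plot(sample)))  # ['Walk', 'E-bike']
```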

However, the generic_metrics, mode_specific_metrics and other notebooks still display charts and values in lbs and miles. Should I change these notebooks as well to use kg/km instead of lbs/miles in the different charts?

I think we have discussed this already. The long-term solution is to use the imperial value from the dynamic config.
For the short-term, we can leave it unchanged. Please file another cleanup issue to handle it.

@shankari shankari moved this from Ready for review to Issues being worked on in OpenPATH Tasks Overview Oct 27, 2023
@iantei iantei moved this from Issues being worked on to Ready for review in OpenPATH Tasks Overview Oct 27, 2023
@shankari shankari moved this from Ready for review to Issues being worked on in OpenPATH Tasks Overview Oct 28, 2023
@shankari
Contributor Author

Changes have been pushed to production

@github-project-automation github-project-automation bot moved this from Issues being worked on to Tasks completed in OpenPATH Tasks Overview Jan 15, 2024