Merge pull request #69 from WenjieDu/dev
Change the default data_home_path and update the docs
WenjieDu authored Jun 28, 2024
2 parents bd2fe56 + 4dd36e9 commit a57f9e6
Showing 5 changed files with 36 additions and 26 deletions.
7 changes: 4 additions & 3 deletions README.md
@@ -95,7 +95,8 @@ tsdb.delete_cache(dataset_name='physionet_2012')
 # or you can delete all cache with delete_cached_data() to free disk space
 tsdb.delete_cache()

-# to avoid taking up too much space if downloading many datasets,
+# The default cache directory is ~/.pypots/tsdb under the user's home directory.
+# To avoid taking up too much space if downloading many datasets ,
 # TSDB cache directory can be migrated to an external disk
 tsdb.migrate_cache("/mnt/external_disk/TSDB_cache")
 ```
@@ -145,9 +146,9 @@ year={2023},
 }
 ```
 or
-> Wenjie Du. (2023).
+> Wenjie Du.
 > PyPOTS: a Python toolbox for data mining on Partially-Observed Time Series.
-> arXiv, abs/2305.18811. https://arxiv.org/abs/2305.18811
+> arXiv, abs/2305.18811, 2023.

25 changes: 13 additions & 12 deletions docs/index.rst
@@ -49,7 +49,7 @@ Welcome to TSDB documentation!
 .. image:: https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FWenjieDu%2FTime_Series_Database&count_bg=%2379C83D&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=Visits+since+April+2022&edge_flat=false
 :alt: Visit num

-📣 TSDB now supports a total of 1️⃣6️⃣9️⃣ time-series datasets ‼️
+📣 TSDB now supports a total of 1️⃣7️⃣2️⃣ time-series datasets ‼️

 .. image:: https://pypots.com/figs/pypots_logos/PyPOTS/logo_FFBG.svg
 :width: 120
@@ -100,12 +100,15 @@ or install from source code:
 tsdb.download_and_extract('physionet_2012', './save_it_here')
 # datasets you once loaded are cached, and you can check them with list_cached_data()
 tsdb.list_cache()
-# you can delete only one specific dataset and preserve others
+# you can delete only one specific dataset's pickled cache
+tsdb.delete_cache(dataset_name='physionet_2012', only_pickle=True)
+# you can delete only one specific dataset raw files and preserve others
 tsdb.delete_cache(dataset_name='physionet_2012')
 # or you can delete all cache with delete_cached_data() to free disk space
 tsdb.delete_cache()
-# to avoid taking up too much space if downloading many datasets,
+# The default cache directory is ~/.pypots/tsdb under the user's home directory.
+# To avoid taking up too much space if downloading many datasets ,
 # TSDB cache directory can be migrated to an external disk
 tsdb.migrate_cache("/mnt/external_disk/TSDB_cache")
@@ -124,6 +127,8 @@ That's all. Simple and efficient. Enjoy it! 😃
 `Electricity Load Diagrams <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/electricity_load_diagrams>`_ :cite:`trindade2015electricity` Forecasting, Imputation
 `Electricity Transformer Temperature (ETT) <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/electricity_transformer_temperature>`_ :cite:`zhou2021informer` Forecasting, Imputation
 `Vessel AIS data <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/vessel_ais>`_ :cite:`grgicevic2023ais` Forecasting, Imputation, Classification
+`PeMS Traffic <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/pems_traffic>`_ Forecasting, Imputation
+`Solar Alabama <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/solar_alabama>`_ Forecasting, Imputation
 `UCR & UEA Datasets <https://github.com/WenjieDu/TSDB/tree/main/dataset_profiles/ucr_uea_datasets>`_ (all 163 datasets) :cite:`bagnall2018uea` :cite:`dau2018ucr` Classification
 ========================================================================================================================================================================== ==========================================

@@ -145,22 +150,18 @@ please cite it as below and 🌟star `PyPOTS repository <https://github.com/Wenj
 .. code-block:: bibtex
 :linenos:
-@article{du2023PyPOTS,
-title={{PyPOTS: A Python Toolbox for Data Mining on Partially-Observed Time Series}},
+@article{du2023pypots,
+title={{PyPOTS: a Python toolbox for data mining on Partially-Observed Time Series}},
 author={Wenjie Du},
+journal={arXiv preprint arXiv:2305.18811},
 year={2023},
-eprint={2305.18811},
-archivePrefix={arXiv},
-primaryClass={cs.LG},
-url={https://arxiv.org/abs/2305.18811},
-doi={10.48550/arXiv.2305.18811},
 }
 or

-Wenjie Du. (2023).
+Wenjie Du.
 PyPOTS: A Python Toolbox for Data Mining on Partially-Observed Time Series.
-arXiv, abs/2305.18811. https://doi.org/10.48550/arXiv.2305.18811
+arXiv, abs/2305.18811, 2023.


 .. toctree::
2 changes: 1 addition & 1 deletion tsdb/__init__.py
@@ -21,7 +21,7 @@
 #
 # Dev branch marker is: 'X.Y.dev' or 'X.Y.devN' where N is an integer.
 # 'X.Y.dev0' is the canonical version of 'X.Y.dev'
-__version__ = "0.4"
+__version__ = "0.5"

 from .data_processing import (
     CACHED_DATASET_DIR,
2 changes: 1 addition & 1 deletion tsdb/config.ini
@@ -1,2 +1,2 @@
 [path]
-data_home = ~/.tsdb
+data_home = ~/.pypots/tsdb
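
Reviewer note: the new `data_home` value still begins with `~`, so it has to be expanded before use. The snippet below is a minimal sketch of reading such a value with the standard-library `configparser`; the names `CONFIG_TEXT` and `read_data_home` are hypothetical and are not part of TSDB.

```python
import configparser
import os

# Inline copy of the config fragment above, used only for illustration.
CONFIG_TEXT = """
[path]
data_home = ~/.pypots/tsdb
"""


def read_data_home(config_text):
    """Return the raw and the user-expanded value of [path] data_home."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    raw = parser.get("path", "data_home")
    # expanduser turns a leading '~' into the current user's home directory
    return raw, os.path.expanduser(raw)


raw_value, expanded = read_data_home(CONFIG_TEXT)
print(raw_value)
print(expanded)
```

`os.path.expanduser` only expands a leading `~`, whereas the `str.replace` call used in `determine_data_home()` substitutes every `~` in the string; for the default value both give the same result.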
26 changes: 17 additions & 9 deletions tsdb/utils/file.py
@@ -103,25 +103,33 @@ def determine_data_home():
     data_home_path = config.get("path", "data_home")
     # replace '~' with the absolute path if existing in the path
     data_home_path = data_home_path.replace("~", os.path.expanduser("~"))
-    old_cached_dataset_dir = os.path.join(
+
+    # old cached dataset dir path used in TSDB v0.2
+    old_cached_dataset_dir_02 = os.path.join(
         os.path.expanduser("~"), ".tsdb_cached_datasets"
     )
+    # old cached dataset dir path used in TSDB v0.4
+    old_cached_dataset_dir_04 = os.path.join(os.path.expanduser("~"), ".tsdb")

-    if os.path.exists(old_cached_dataset_dir):
-        # use the old path and warn the user
+    if os.path.exists(old_cached_dataset_dir_02) or os.path.exists(
+        old_cached_dataset_dir_04
+    ):
         logger.warning(
-            "‼️ Detected the home dir of the old version TSDB. "
-            "Since v0.3, TSDB has changed the default cache dir to '~/.tsdb'. "
-            "Auto migrating downloaded datasets to the new path. "
+            "‼️ Detected the home dir of the old version TSDB. Auto migrating... Please wait."
         )
         cached_dataset_dir = data_home_path
-        migrate(old_cached_dataset_dir, cached_dataset_dir)
+        if os.path.exists(old_cached_dataset_dir_02):
+            migrate(old_cached_dataset_dir_02, cached_dataset_dir)
+        else:
+            migrate(old_cached_dataset_dir_04, cached_dataset_dir)
+        logger.info("🌟 Migrating finished.")
     elif os.path.exists(data_home_path):
         # use the path directly, may be in a portable disk
         cached_dataset_dir = data_home_path
     else:
-        # use the default path
-        default_path = os.path.join(os.path.expanduser("~"), ".tsdb")
+        # use the default path for initialization,
+        # e.g. `data_home_path` in a portable disk but the disk is not connected
+        default_path = os.path.join(os.path.expanduser("~"), ".pypots", "tsdb")
         cached_dataset_dir = default_path
         if os.path.abspath(data_home_path) != os.path.abspath(default_path):
             logger.warning(
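
Reviewer note: the branch order in `determine_data_home()` after this change is (1) migrate a legacy cache directory if one exists, (2) otherwise use the configured path if it exists, (3) otherwise fall back to `~/.pypots/tsdb`. The sketch below restates that order as a standalone function so it can be tried against a temporary directory. It is an illustration under assumptions, not the project's code: `resolve_data_home` and its `home` parameter are hypothetical, and `shutil.move` stands in for TSDB's own `migrate()` helper, whose behavior is not shown in this diff.

```python
import os
import shutil
import tempfile


def resolve_data_home(configured_path, home):
    """Pick the cache directory, migrating a legacy directory if one exists.

    `home` stands in for os.path.expanduser("~") so the logic can be run
    against a throwaway directory. Hypothetical helper, not part of TSDB.
    """
    configured_path = configured_path.replace("~", home)
    legacy_dirs = [
        os.path.join(home, ".tsdb_cached_datasets"),  # path used in v0.2
        os.path.join(home, ".tsdb"),  # path used in v0.4
    ]
    default_path = os.path.join(home, ".pypots", "tsdb")

    for legacy in legacy_dirs:
        if os.path.exists(legacy):
            # Stand-in for migrate(); assumes the target does not exist yet.
            os.makedirs(os.path.dirname(configured_path), exist_ok=True)
            shutil.move(legacy, configured_path)
            return configured_path

    if os.path.exists(configured_path):
        # e.g. a cache that was already moved to an external disk
        return configured_path
    return default_path


with tempfile.TemporaryDirectory() as fake_home:
    # Simulate a v0.4 cache holding one dataset folder.
    os.makedirs(os.path.join(fake_home, ".tsdb", "physionet_2012"))
    chosen = resolve_data_home("~/.pypots/tsdb", fake_home)
    print(os.path.relpath(chosen, fake_home))
    print(os.listdir(chosen))
```

As in the committed code, only the first legacy directory found is migrated: when both `~/.tsdb_cached_datasets` and `~/.tsdb` exist, the v0.4 directory is left in place.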
