Unsupervised training #3

Open · mar-muel opened this issue on Jun 17, 2020 · 1 comment
mar-muel (Collaborator):

Create a new command pretrain which does three things (sketched below):

  1. Preprocess the data (potentially cache this step)
  2. Run fasttext unsupervised training
  3. Store the model artefacts etc. into output/{run_name}
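
A minimal sketch of what such a command could look like, assuming the official fasttext Python bindings; the preprocessing step and the output/{run_name} layout are placeholders based on the list above, not actual repo code:

import os
import fasttext  # official fastText Python bindings

def preprocess(input_path):
    """Placeholder preprocessing step (assumption): lowercase each line."""
    out_path = input_path + '.preprocessed'
    with open(input_path) as f_in, open(out_path, 'w') as f_out:
        for line in f_in:
            f_out.write(line.lower())
    return out_path

def pretrain(input_path, run_name):
    # 1. Preprocess the data (this is the step that could be cached)
    preprocessed_path = preprocess(input_path)
    # 2. Run fasttext unsupervised training (skipgram by default)
    model = fasttext.train_unsupervised(preprocessed_path, model='skipgram')
    # 3. Store model artefacts into output/{run_name}
    output_dir = os.path.join('output', run_name)
    os.makedirs(output_dir, exist_ok=True)
    model.save_model(os.path.join(output_dir, 'model.bin'))
    return output_dir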

Potentially using a cached helper:

import hashlib
import logging
import shelve
from functools import wraps

logger = logging.getLogger(__name__)

def cached(f_name):
    """Uses a shelve to pickle return values of function calls."""
    cache_path = get_cache_path(f_name)  # helper assumed to exist elsewhere in the repo
    def cacheondisk(fn):
        @wraps(fn)
        def usingcache(*args, **kwargs):
            # Pop the control flag so it is not forwarded to fn;
            # passing __cached=False forces recomputation
            __cached = kwargs.pop('__cached', True)
            # Derive a short, stable cache key from the call arguments
            key = repr((args, kwargs))
            key = hashlib.md5(key.encode('utf-8')).hexdigest()[:10]
            # Open the shelf per call so it is reliably closed again
            with shelve.open(cache_path) as db:
                if not __cached or key not in db:
                    ret = db[key] = fn(*args, **kwargs)
                    logger.info(f'Saved data for {f_name} using key {key}')
                else:
                    logger.info(f'Loading data for {f_name} using key {key}...')
                    ret = db[key]
            return ret
        return usingcache
    return cacheondisk
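
Illustrative usage of the decorator (the function name and arguments are hypothetical, and get_cache_path is assumed to resolve to a writable path):

# Hypothetical usage: cache the preprocessing step on disk
@cached('preprocess_data')
def preprocess_data(input_path, lowercase=True):
    ...

data = preprocess_data('data/tweets.txt')                  # computed and cached
data = preprocess_data('data/tweets.txt')                  # served from the cache
data = preprocess_data('data/tweets.txt', __cached=False)  # forced recomputation
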
mar-muel assigned utanashati and mar-muel, and unassigned mar-muel, on Jun 17, 2020
mar-muel (Collaborator, Author) commented on Jun 24, 2020:

  • Each model has a pretrain function (for the base model)
  • Pass a config to pretrain as well
  • Save the output to other/models/pretrain
  • pretrain_path: defaults to other/models/pretrain
  • Set in config_reader.py, line 134, def _get_default_paths(self) (see the sketch after this list)
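
A minimal sketch of how these pieces could fit together; the class names, method bodies, and config access are assumptions based on the bullets above, not the actual repo code:

import os

class ConfigReader:
    # Hypothetical: wiring the default in _get_default_paths (config_reader.py)
    def _get_default_paths(self):
        return {
            'pretrain_path': os.path.join('other', 'models', 'pretrain'),
        }

class BaseModel:
    # Hypothetical per-model hook: each model receives the run config
    def pretrain(self, config):
        output_dir = config.get('pretrain_path',
                                os.path.join('other', 'models', 'pretrain'))
        os.makedirs(output_dir, exist_ok=True)
        # ... train the base model and save artefacts under output_dir ...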
