make replace parameter work in edit_dataset_metadata #146

Open: wants to merge 2 commits into base: main
11 changes: 5 additions & 6 deletions src/pyDataverse/api.py
@@ -185,7 +185,7 @@ def post_request(self, url, data=None, auth=False, params=None, files=None):
"ERROR: POST - Could not establish connection to API: {0}".format(url)
)

- def put_request(self, url, data=None, auth=False, params=None):
Collaborator

This should stay None.
For mutable defaults it is almost always better to use None instead of {} or []; the common pattern is:

def fun(arg=None):
    if arg is None:
        arg = {}

The reason is, roughly speaking, that Python evaluates default values once, when the def statement is executed, and stores the resulting objects on the function.
If such a default is then mutated, e.g. appended to, every later call that omits the argument sees the same modified object. Consider:

def fun(arg=[]):
    arg.append(1)
    print(arg)
fun()
fun()
fun()

This would print:

[1]
[1, 1]
[1, 1, 1]

However, if the expected output is [1], [1], [1], the code should be:

def fun(arg=None):
    if arg is None:
        arg = []
    arg.append(1)
    print(arg)
fun()
fun()
fun()

While it is not so important here, I would still recommend sticking with the convention unless there's good reason not to.

For more information, see https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects.
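
Applied to this PR, the None convention might look like the sketch below. `build_params` is a hypothetical helper, not part of pyDataverse; it assumes the method only needs to merge caller-supplied parameters with the `User-Agent` and `key` entries visible in the diff. The token value is a placeholder.

```python
def build_params(params=None, api_token=None):
    """Hypothetical helper: merge caller params with the request defaults."""
    # Copy so that neither the caller's dict nor a shared default is mutated.
    merged = dict(params) if params is not None else {}
    merged["User-Agent"] = "pydataverse"
    if api_token:
        merged["key"] = api_token
    return merged


caller_params = {"replace": "true"}
first = build_params(caller_params, api_token="dummy-token")
second = build_params()

print(first)          # {'replace': 'true', 'User-Agent': 'pydataverse', 'key': 'dummy-token'}
print(second)         # {'User-Agent': 'pydataverse'}
print(caller_params)  # {'replace': 'true'}
```

The second call starts from a fresh dict, so nothing from the first call leaks into it, and the caller's own dict is left unchanged.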

+ def put_request(self, url, data=None, auth=False, params={}):
"""Make a PUT request.

Parameters
@@ -198,15 +198,14 @@ def put_request(self, url, data=None, auth=False, params=None):
Should an api token be sent in the request. Defaults to `False`.
params : dict
Dictionary of parameters to be passed with the request.
- Defaults to `None`.
+ Defaults to `{}`.

Returns
-------
requests.Response
Response object of requests library.

"""
- params = {}
params["User-Agent"] = "pydataverse"
if self.api_token:
params["key"] = self.api_token
@@ -1275,7 +1274,7 @@ def edit_dataset_metadata(
Get dataset metadata::

>>> data = api.get_dataset(doi).json()["data"]["latestVersion"]["metadataBlocks"]["citation"]
- >>> resp = api.edit_dataset_metadata(doi, data, is_replace=True, auth=True)
+ >>> resp = api.edit_dataset_metadata(doi, data, replace=True, auth=True)
>>> resp.status_code
200: metadata updated

@@ -1288,7 +1287,7 @@ def edit_dataset_metadata(
url = "{0}/datasets/editMetadata/{1}".format(
self.base_url_api_native, identifier
)
- params = {"replace": True} if replace else {}
Collaborator

This can stay as it is. The string may have been necessary with requests, but with httpx I tried this:

>>> import httpx
>>> httpx.get('https://example.org', params={'bool': True, 'string': 'true'}).url
URL('https://example.org?bool=true&string=true')

So both seem to result in the same outcome. One could even consider using params = {"replace": replace} instead of the ternary. But I am not sure whether the API accepts an explicit replace=false, so I would keep it as-is.
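
For comparison, here is a stdlib-only sketch of what happens when a Python boolean is serialized with plain `str()`, which is what `urllib.parse.urlencode` does. It is only an illustration of why the string "true" leaves no ambiguity about capitalization; it makes no claim about how requests or the Dataverse API treat the value.

```python
from urllib.parse import urlencode

# A bool is converted with str(), so it keeps Python's capitalization.
bool_query = urlencode({"replace": True})
# A string is passed through unchanged.
str_query = urlencode({"replace": "true"})

print(bool_query)  # replace=True
print(str_query)   # replace=true
```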

+ params = {"replace": "true"} if replace else {}
resp = self.put_request(url, metadata, auth, params)

if resp.status_code == 401:
@@ -1304,7 +1303,7 @@ def edit_dataset_metadata(
else:
print(
"You may not add data to a field that already has data and does not"
- " allow multiples. Use is_replace=true to replace existing data."
+ " allow multiples. Use replace=True to replace existing data."
)
elif resp.status_code == 200:
print("Dataset '{0}' updated".format(identifier))