How to avoid duplicated data due to bnds when converting nc to a dataframe? #117
-
Hello Robert! I am converting subsetted files from netCDF to dataframes and then to CSV using nctoolkit. After the conversion, I have duplicated data in the CSV. The CSV file has a "bnds" column with two values, 0 and 1, which is where the duplicates come from. I read the manual and found that I could use `ds.strip_variables` to delete "bnds" before converting to a dataframe, but when I applied the method, I got an error saying that "bnds is not a valid variable!". I tried calling the method in two different ways, but neither worked. Could you give me some guidance on how I could solve this issue? I appreciate your help!
Replies: 2 comments 2 replies
-
You will need to remove these manually using pandas. So something like:
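The snippet from the original reply is not reproduced here. As a minimal sketch of the idea, and not necessarily the code that was posted, the following assumes the dataframe has a `bnds` column (or index level) holding 0 and 1, keeps one row per record, and drops the column. The sample data, the `tas` column, and the helper name `drop_bounds` are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for a dataframe converted from a netCDF file:
# every record appears twice, once for bnds == 0 and once for bnds == 1.
df = pd.DataFrame({
    "time": ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"],
    "bnds": [0, 1, 0, 1],
    "tas": [280.1, 280.1, 281.4, 281.4],
})


def drop_bounds(frame, bounds_dim="bnds"):
    """Keep one row per record and drop the bounds column (illustrative helper)."""
    # If the dimensions live in the index, move them into ordinary columns first.
    if bounds_dim in frame.index.names:
        frame = frame.reset_index()
    # Nothing to do when there is no bounds column at all.
    if bounds_dim not in frame.columns:
        return frame
    # Keep only the rows belonging to the first bound, then remove the column.
    first = frame[bounds_dim].min()
    kept = frame[frame[bounds_dim] == first]
    return kept.drop(columns=bounds_dim).reset_index(drop=True)


clean = drop_bounds(df)
```

The result can then be written out with `clean.to_csv(..., index=False)`. Any column that holds the bounds values themselves would keep only the entry for the first bound after this filter.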
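I don't have the data, so I'm not sure what the bnds refer to. But for ocean data, this is often the maximum and minimum depth for a particular cell. It is data associated with coordinates, not a data variable. This kind of information is sometimes useful, so nctoolkit keeps it in the output from `to_dataframe`. An example is when you can calculate the cell height from the bnds and need that later on. Though, if you are getting 0s and 1s, then there is probably no meaningful information in it. However, `to_dataframe` probably should…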
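To illustrate the case where cell height is derived from the bounds, here is a small sketch in pandas. It assumes the bounds values sit in a column called `depth_bnds` alongside the `bnds` index; that column name and the sample depths are hypothetical.

```python
import pandas as pd

# Hypothetical dataframe: each depth level has a top (bnds == 0)
# and a bottom (bnds == 1) boundary stored in depth_bnds.
df = pd.DataFrame({
    "depth": [5.0, 5.0, 15.0, 15.0],
    "bnds": [0, 1, 0, 1],
    "depth_bnds": [0.0, 10.0, 10.0, 20.0],
})

# Spread the two bounds into separate columns, one row per depth level.
wide = df.pivot(index="depth", columns="bnds", values="depth_bnds")

# Cell height is the difference between the two boundaries.
cell_height = (wide[1] - wide[0]).rename("cell_height").reset_index()
```

Here `cell_height` has one row per depth level, so it can be merged back onto a dataframe from which the duplicate `bnds` rows have already been removed.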
-
In this case the bounds relate to time. So, there is a … Though, in this case it looks like it can be ignored, as it looks like daily data and the time is that day.
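One way to check that time bounds can safely be ignored is to confirm that each pair spans no more than a day. This is a sketch under the assumption that the bounds values appear in a column named `time_bnds`; that name and the sample dates are illustrative only.

```python
import pandas as pd

# Hypothetical dataframe with daily time steps and their start/end bounds.
df = pd.DataFrame({
    "time": pd.to_datetime(
        ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"]
    ),
    "bnds": [0, 1, 0, 1],
    "time_bnds": pd.to_datetime(
        ["2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03"]
    ),
})

# Width of each time cell: latest bound minus earliest bound per time step.
grouped = df.groupby("time")["time_bnds"]
span = grouped.max() - grouped.min()

# True when every time step covers at most one day.
is_daily = bool((span <= pd.Timedelta(days=1)).all())
```

If `is_daily` is true, the bounds add nothing beyond the time value itself, and the duplicate rows can be dropped without losing information.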