Issue with dateMonthDayYearEN function in Transformation __init__ file #119

BenGraWarBuf · 2023-10-06T22:06:57Z

I was getting the following error when parsing "https://www.sec.gov/Archives/edgar/data/40704/000119312523177500/0001193125-23-177500-index.htm":

File ~\anaconda3\envs\fundamentalchat_classifier2\lib\site-packages\xbrl\instance.py:749, in XbrlParser.parse_instance(self, uri, instance_url, encoding)
745 if uri.split('.')[-1] == 'xml' or uri.split('.')[-1] == 'xbrl':
746 return parse_xbrl_url(uri, self.cache) if uri.startswith('http')
747 else parse_xbrl(uri, self.cache, instance_url)
748 return parse_ixbrl_url(uri, self.cache) if uri.startswith('http')
--> 749 else parse_ixbrl(uri, self.cache, instance_url, encoding)

File ~\anaconda3\envs\fundamentalchat_classifier2\lib\site-packages\xbrl\instance.py:511, in parse_ixbrl(instance_path, cache, instance_url, encoding, schema_root)
509 facts.append(NumericFact(concept, context, fact_value, unit, decimals, xml_id))
510 elif fact_elem.tag == '{' + ns_map['ix'] + '}nonNumeric':
--> 511 fact_value: str = _extract_non_numeric_value(fact_elem)
512 facts.append(TextFact(concept, context, str(fact_value), xml_id))
513 #print(f"Added TextFact with value: {fact_value} and concept: {concept.name}")
514 #print(f"Total facts in XbrlInstance: {len(facts)}")
515 #for fact in facts:
516 #if fact.concept.name == "DocumentType":
517 #print(f"Found DocumentType fact with value: {fact.value}")

File ~\anaconda3\envs\fundamentalchat_classifier2\lib\site-packages\xbrl\instance.py:543, in _extract_non_numeric_value(fact_elem)
541 registryNS: str = fact_elem.attrib['ns_map'][registryPrefix]
542 try:
--> 543 fact_value = normalize(registryNS, formatCode, fact_value)
544 #print(f"Normalized fact_value: {fact_value}")
545 except TransformationNotImplemented:

File ~\anaconda3\envs\fundamentalchat_classifier2\lib\site-packages\xbrl\transformations\__init__.py:589, in normalize(namespace, formatCode, value)
587 return ixt2formatCode
588 elif namespace == 'http://www.xbrl.org/inlineXBRL/transformation/2015-02-26':
--> 589 return ixt3formatCode
590 elif namespace == 'http://www.xbrl.org/inlineXBRL/transformation/2020-02-12':
591 return ixt4formatCode

File ~\anaconda3\envs\fundamentalchat_classifier2\lib\site-packages\xbrl\transformations\__init__.py:229, in dateMonthDayYearEN(arg)
226 def dateMonthDayYearEN(arg: str) -> str:
227 # Mon(th)(D)D(Y)Y(YY) -> YYYY-MM-DD
228 seg = re.split(r'[^\d\w]+', arg) # split at any char that is not a digit nor a word
--> 229 return f"{yearNorm(seg[2])}-{monthNorm[seg[0]]}-{seg[1].zfill(2)}"

IndexError: list index out of range

I "fixed" it by updating the dateMonthDayYearEN function as follows:

    def dateMonthDayYearEN(arg: str) -> str:
        # Mon(th)(D)D(Y)Y(YY) -> YYYY-MM-DD
        try:
            # Split on any run of characters that are neither digits nor word characters
            seg = re.split(r'[^\d\w]+', arg)
            # Return the input unchanged unless it has exactly three segments
            # and the year segment is a 4-digit number
            if len(seg) != 3 or not (len(seg[2]) == 4 and seg[2].isdigit()):
                return arg
            return f"{yearNorm(seg[2])}-{monthNorm[seg[0]]}-{seg[1].zfill(2)}"
        except Exception:
            print(f"Error processing date string: {arg}")
            raise
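For reference, here is a self-contained sketch of how the patched logic behaves. `MONTH_NORM` and `year_norm` are simplified stand-ins I made up for the library's `monthNorm` and `yearNorm` (they are not the real implementations), and `patched_date_month_day_year_en` is just an illustrative name, so treat this as a rough illustration only.

```python
import re

# Hypothetical stand-ins for the library's monthNorm and yearNorm (assumptions).
_MONTHS = ["January", "February", "March", "April", "May", "June",
           "July", "August", "September", "October", "November", "December"]
MONTH_NORM = {}
for number, name in enumerate(_MONTHS, start=1):
    MONTH_NORM[name] = f"{number:02d}"      # full name, e.g. "January"
    MONTH_NORM[name[:3]] = f"{number:02d}"  # abbreviation, e.g. "Jan"


def year_norm(year: str) -> str:
    # Only 4-digit years reach this stand-in because of the guard below.
    return year


def patched_date_month_day_year_en(arg: str) -> str:
    seg = re.split(r'[^\d\w]+', arg)
    if len(seg) != 3 or not (len(seg[2]) == 4 and seg[2].isdigit()):
        return arg  # malformed or incomplete value: pass through unchanged
    return f"{year_norm(seg[2])}-{MONTH_NORM[seg[0]]}-{seg[1].zfill(2)}"


print(patched_date_month_day_year_en("May 13, 2023"))  # 2023-05-13
print(repr(patched_date_month_day_year_en("May ")))    # 'May '
```

One side effect of the 4-digit check: a value with a two-digit year such as "May 13, 23" is returned unchanged rather than normalized.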

manusimidt · 2023-10-10T18:14:26Z

Hello, thanks for the issue.

Yes, that's unfortunately a problem with the filing.
It crashes while parsing the following line:

    <ix:nonnumeric 
      id="ID_2354" 
      name="us-gaap:DebtInstrumentMaturityDate" 
      contextref="FROM_May30_2022_TO_May28_2023_Entity_0000040704_us-gaap_LongtermDebtTypeAxis_gis_FloatingRateNotesDueMay162023Member" 
      format="ixt:datemonthdayyearen" 
      continuedat="XBRL_CS_ce79dd67b6074567a28709eb79e95662_1">May </ix:nonnumeric>

The issue is that the content of the fact, "May ", does not conform to the format "date-month-day-year-en". This format expects a complete date, something like "May 13, 2023". Because the value is incomplete (the day and year are missing), the transformation fails and the parser cannot parse this filing.
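The failure can be reproduced in isolation with the same regular expression that appears in the traceback. This is only a minimal sketch of the splitting step, not the library's code path; `raw_value` is an illustrative name for the element's text content.

```python
import re

raw_value = "May "  # text content of the ix:nonnumeric element quoted above

# Same split as in dateMonthDayYearEN: break on runs of non-word characters.
seg = re.split(r'[^\d\w]+', raw_value)
print(seg)  # ['May', '']

# The transformation reads seg[0], seg[1] and seg[2]; only two segments exist.
try:
    seg[2]
except IndexError as exc:
    print(f"IndexError: {exc}")  # IndexError: list index out of range
```

A complete value such as "May 13, 2023" splits into three segments (`['May', '13', '2023']`), which is what the function relies on.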

manusimidt self-assigned this Oct 10, 2023
