Loading a Confluence page that has attachments into Helix GPT fails with an error.
The ErrorDetails in the record show the following trace information:
Traceback (most recent call last):
  File "/opt/bmc/data-connection/app/jobs/service.py", line 417, in handle_job_step
    handler(job, job_step, chain, connection)
  File "/opt/bmc/data-connection/app/connections/confluence/loader.py", line 25, in load_confluence_page
    load_page_attachments(job, job_step, chain, page, confluence_service, connection.id)
  File "/opt/bmc/data-connection/app/connections/confluence/loader.py", line 54, in load_page_attachments
    load_attachment(job, job_step, chain, attachment, confluence_service, temp_dir_name, connection_id)
  File "/opt/bmc/data-connection/app/connections/confluence/loader.py", line 68, in load_attachment
    langchain_document = loader.load()
                         ^^^^^^^^^^^^^
  File "/opt/bmc/.local/lib/python3.11/site-packages/langchain_community/document_loaders/word_document.py", line 57, in load
    page_content=docx2txt.process(self.file_path),
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/bmc/.local/lib/python3.11/site-packages/docx2txt/docx2txt.py", line 76, in process
    zipf = zipfile.ZipFile(docx)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/zipfile.py", line 1312, in __init__
    self._RealGetContents()
  File "/usr/local/lib/python3.11/zipfile.py", line 1379, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
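
The trace ends in docx2txt opening the attachment with zipfile.ZipFile. A .docx file is a zip archive, so BadZipFile suggests that the attachment being processed is not a valid .docx package, whatever its file name says. The following is a minimal sketch for checking a downloaded copy of a suspect attachment outside Helix GPT. The function name is_probably_docx is hypothetical and is not part of the product, and the check for word/document.xml assumes the usual layout of a Word package.

```python
import zipfile


def is_probably_docx(path):
    """Return True if the file at `path` looks like a .docx package.

    Hypothetical helper for diagnosis only. It checks that the file is a
    zip archive and that it contains word/document.xml, the entry that
    normally holds the main document body.
    """
    # A file that is not a zip archive at all is what triggers BadZipFile.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return "word/document.xml" in archive.namelist()
```

Running this against each Word attachment on the page may help identify the file that triggers the error. A False result would point to a file that only carries a Word extension and is not actually in .docx format.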