From @rithvikmahin #24
Cause of issue: Some of the sources found are PDF documents and academic papers that are very long, with over 1.5 million characters. spaCy takes over 30 seconds to run nlp(text) and create a document object from text of that length, which stalls the entire processing system.
Temporary solution: Added a timer that stops processing a document if it takes longer than 30 seconds, then moves on to the next one.
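
A minimal sketch of how such a timer could work, not the code actually used here. It assumes a thread-based approach: the parse runs in a daemon thread and the caller waits at most 30 seconds for it. The names `run_with_timeout`, `process_all`, and `parse` are hypothetical; `parse` stands in for the loaded spaCy pipeline (`nlp`). One caveat of this approach is that the timed-out thread is abandoned rather than killed, so a hard stop would need a separate process instead.

```python
import threading

TIMEOUT_SECONDS = 30  # limit quoted in this issue


def run_with_timeout(func, arg, timeout=TIMEOUT_SECONDS):
    """Return (True, result) if func(arg) finishes in time, else (False, None)."""
    outcome = {}

    def worker():
        try:
            outcome["result"] = func(arg)
        except Exception as exc:  # pass worker errors back to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # Timed out: the daemon thread keeps running in the background
        # but no longer blocks the rest of the pipeline.
        return False, None
    if "error" in outcome:
        raise outcome["error"]
    return True, outcome["result"]


def process_all(texts, parse, timeout=TIMEOUT_SECONDS):
    """Parse each text, skipping any that exceed the timeout."""
    docs = []
    skipped = []  # indices of texts that timed out
    for index, text in enumerate(texts):
        finished, doc = run_with_timeout(parse, text, timeout)
        if finished:
            docs.append(doc)
        else:
            skipped.append(index)
    return docs, skipped
```

With a loaded pipeline this would be called as `docs, skipped = process_all(texts, nlp)`; `skipped` then lists the documents that were too slow to parse.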
Potential solution / TODO: Add a queue for all tweets that take longer than 30 seconds to process, and return a "Will provide the source later" message to the user in the meantime. Once a queued tweet has been processed, send its result to the user, however much later that is.
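
One way the queue idea could be sketched, purely as an illustration of the TODO and not an existing implementation. Everything here is hypothetical: the class `DeferredTweets`, the `quick_lookup` callable (assumed to return `None` when the 30-second limit is hit), the `full_lookup` callable (the same lookup with no time limit), and the `reply` callable that sends a result to the user.

```python
import queue

PENDING_MESSAGE = "Will provide the source later"


class DeferredTweets:
    """Holds tweets whose source lookup timed out, for later processing."""

    def __init__(self):
        self._pending = queue.Queue()

    def handle(self, tweet, quick_lookup):
        """Try the time-limited lookup; defer the tweet if it did not finish."""
        source = quick_lookup(tweet)  # assumption: None means "timed out"
        if source is None:
            self._pending.put(tweet)
            return PENDING_MESSAGE
        return source

    def drain(self, full_lookup, reply):
        """Process every deferred tweet without a time limit and reply to each.

        Returns the number of tweets that were handled.
        """
        handled = 0
        while True:
            try:
                tweet = self._pending.get_nowait()
            except queue.Empty:
                return handled
            reply(tweet, full_lookup(tweet))
            handled += 1
```

In this sketch `drain` would be run from a background worker or a scheduled job, so slow documents never block replies to other tweets.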