I have an Azure Web App that serves an AI API built on the Questgen question generator project. I use my own fork of the project, which has no real changes except that it can use the latest version of Sense2Vec. The app works fine on my local machine, but after I deploy it and send a GET request over HTTPS from Postman, which should just return 'Hello World!', I get an error in the container. I've pasted the full container log below. The error is a LookupError raised from `root = nltk.data.find(f"{self.subdir}/{self.__name}")`, which ends in `raise LookupError(resource_not_found)`. This doesn't make sense to me, because my code calls `nltk.download()` for the resources it needs.
This has me questioning my understanding of how Azure Web Apps work. I've been learning about Docker containers recently, so when the log says that the container failed, does that mean a new container process is started for each incoming HTTP request? If so, that would be a problem, because the NLTK resources would be downloaded on every request, and that download is the most resource-intensive step. Also, if that's the case, how can I fix it? Do I need a different Azure resource than Web Apps or Web APIs?
2021-10-22T02:42:49.727079159Z _____
2021-10-22T02:42:49.727084159Z / _ \ __________ _________ ____
2021-10-22T02:42:49.727094459Z / /_\ \___ / | \_ __ \_/ __ \
2021-10-22T02:42:49.727098059Z / | \/ /| | /| | \/\ ___/
2021-10-22T02:42:49.727101659Z \____|__ /_____ \____/ |__| \___ >
2021-10-22T02:42:49.727106059Z \/ \/ \/
2021-10-22T02:42:49.727109659Z
2021-10-22T02:42:49.727112859Z A P P S E R V I C E O N L I N U X
2021-10-22T02:42:49.727116259Z
2021-10-22T02:42:49.727119459Z Documentation: http://aka.ms/webapp-linux
2021-10-22T02:42:49.727122859Z Python 3.7.9
2021-10-22T02:42:49.727126059Z Note: Any data outside '/home' is not persisted
2021-10-22T02:42:49.845758523Z Starting OpenBSD Secure Shell server: sshd.
2021-10-22T02:42:49.870854191Z App Command Line not configured, will attempt auto-detect
2021-10-22T02:42:49.871712000Z Launching oryx with: create-script -appPath /home/site/wwwroot -output /opt/startup/startup.sh -virtualEnvName antenv -defaultApp /opt/defaultsite -bindPort 8000
2021-10-22T02:42:49.880675195Z Found build manifest file at '/home/site/wwwroot/oryx-manifest.toml'. Deserializing it...
2021-10-22T02:42:49.882728517Z Build Operation ID: |Naw/gARSU78=.7ac7fd71_
2021-10-22T02:42:49.883572726Z Oryx Version: 0.2.20210708.1, Commit: 6ceb6608673b94827bac111ef5ea01c216f92abb, ReleaseTagName: 20210708.1
2021-10-22T02:42:49.884122732Z Output is compressed. Extracting it...
2021-10-22T02:42:49.884876440Z Extracting '/home/site/wwwroot/output.tar.gz' to directory '/tmp/8d99502e5af8c42'...
2021-10-22T02:43:38.292543747Z App path is set to '/tmp/8d99502e5af8c42'
2021-10-22T02:43:38.530015948Z Detected an app based on Flask
2021-10-22T02:43:38.530987862Z Generating `gunicorn` command for 'app:app'
2021-10-22T02:43:38.714114285Z Writing output script to '/opt/startup/startup.sh'
2021-10-22T02:43:39.011214940Z Using packages from virtual environment antenv located at /tmp/8d99502e5af8c42/antenv.
2021-10-22T02:43:39.012112353Z Updated PYTHONPATH to ':/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages'
2021-10-22T02:43:39.760096764Z [2021-10-22 02:43:39 +0000] [36] [INFO] Starting gunicorn 20.1.0
2021-10-22T02:43:39.761640586Z [2021-10-22 02:43:39 +0000] [36] [INFO] Listening at: http://0.0.0.0:8000 (36)
2021-10-22T02:43:39.762272095Z [2021-10-22 02:43:39 +0000] [36] [INFO] Using worker: sync
2021-10-22T02:43:39.767909576Z [2021-10-22 02:43:39 +0000] [39] [INFO] Booting worker with pid: 39
2021-10-22T02:43:48.190191968Z [2021-10-22 02:43:48 +0000] [39] [ERROR] Exception in worker process
2021-10-22T02:43:48.190224768Z Traceback (most recent call last):
2021-10-22T02:43:48.190230668Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/nltk/corpus/util.py", line 84, in __load
2021-10-22T02:43:48.190243969Z root = nltk.data.find(f"{self.subdir}/{zip_name}")
2021-10-22T02:43:48.190248069Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/nltk/data.py", line 583, in find
2021-10-22T02:43:48.190252069Z raise LookupError(resource_not_found)
2021-10-22T02:43:48.190255769Z LookupError:
2021-10-22T02:43:48.190259369Z **********************************************************************
2021-10-22T02:43:48.190262969Z Resource stopwords not found.
2021-10-22T02:43:48.190267469Z Please use the NLTK Downloader to obtain the resource:
2021-10-22T02:43:48.190271469Z
2021-10-22T02:43:48.190274969Z >>> import nltk
2021-10-22T02:43:48.190278869Z >>> nltk.download('stopwords')
2021-10-22T02:43:48.190282669Z
2021-10-22T02:43:48.190286269Z For more information see: https://www.nltk.org/data.html
2021-10-22T02:43:48.190289869Z
2021-10-22T02:43:48.190293369Z Attempted to load corpora/stopwords.zip/stopwords/
2021-10-22T02:43:48.190297069Z
2021-10-22T02:43:48.190300569Z Searched in:
2021-10-22T02:43:48.190304169Z - '/root/nltk_data'
2021-10-22T02:43:48.190307669Z - '/opt/python/3.7.9/nltk_data'
2021-10-22T02:43:48.190311169Z - '/opt/python/3.7.9/share/nltk_data'
2021-10-22T02:43:48.190314670Z - '/opt/python/3.7.9/lib/nltk_data'
2021-10-22T02:43:48.190318370Z - '/usr/share/nltk_data'
2021-10-22T02:43:48.190321970Z - '/usr/local/share/nltk_data'
2021-10-22T02:43:48.190325470Z - '/usr/lib/nltk_data'
2021-10-22T02:43:48.190329070Z - '/usr/local/lib/nltk_data'
2021-10-22T02:43:48.190332570Z **********************************************************************
2021-10-22T02:43:48.190336170Z
2021-10-22T02:43:48.190339670Z
2021-10-22T02:43:48.190343070Z During handling of the above exception, another exception occurred:
2021-10-22T02:43:48.190346670Z
2021-10-22T02:43:48.190350070Z Traceback (most recent call last):
2021-10-22T02:43:48.190353670Z File "/opt/python/3.7.9/lib/python3.7/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
2021-10-22T02:43:48.190357470Z worker.init_process()
2021-10-22T02:43:48.190361070Z File "/opt/python/3.7.9/lib/python3.7/site-packages/gunicorn/workers/base.py", line 134, in init_process
2021-10-22T02:43:48.190364870Z self.load_wsgi()
2021-10-22T02:43:48.190368470Z File "/opt/python/3.7.9/lib/python3.7/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
2021-10-22T02:43:48.190373270Z self.wsgi = self.app.wsgi()
2021-10-22T02:43:48.190379370Z File "/opt/python/3.7.9/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
2021-10-22T02:43:48.190383371Z self.callable = self.load()
2021-10-22T02:43:48.190386971Z File "/opt/python/3.7.9/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
2021-10-22T02:43:48.190390871Z return self.load_wsgiapp()
2021-10-22T02:43:48.190394471Z File "/opt/python/3.7.9/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
2021-10-22T02:43:48.190398371Z return util.import_app(self.app_uri)
2021-10-22T02:43:48.190402571Z File "/opt/python/3.7.9/lib/python3.7/site-packages/gunicorn/util.py", line 359, in import_app
2021-10-22T02:43:48.190406671Z mod = importlib.import_module(module)
2021-10-22T02:43:48.190410271Z File "/opt/python/3.7.9/lib/python3.7/importlib/__init__.py", line 127, in import_module
2021-10-22T02:43:48.190414171Z return _bootstrap._gcd_import(name[level:], package, level)
2021-10-22T02:43:48.190417871Z File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
2021-10-22T02:43:48.190421771Z File "<frozen importlib._bootstrap>", line 983, in _find_and_load
2021-10-22T02:43:48.190425571Z File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
2021-10-22T02:43:48.190429371Z File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
2021-10-22T02:43:48.190433171Z File "<frozen importlib._bootstrap_external>", line 728, in exec_module
2021-10-22T02:43:48.190437071Z File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
2021-10-22T02:43:48.190440871Z File "/tmp/8d99502e5af8c42/app.py", line 3, in <module>
2021-10-22T02:43:48.190444771Z from Questgen import main
2021-10-22T02:43:48.190448371Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/Questgen/__init__.py", line 4, in <module>
2021-10-22T02:43:48.190452371Z from Questgen.mcq import mcq
2021-10-22T02:43:48.190455972Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/Questgen/mcq/mcq.py", line 16, in <module>
2021-10-22T02:43:48.190459872Z import pke
2021-10-22T02:43:48.190463472Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/pke/__init__.py", line 5, in <module>
2021-10-22T02:43:48.190467372Z from pke.base import LoadFile
2021-10-22T02:43:48.190470972Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/pke/base.py", line 31, in <module>
2021-10-22T02:43:48.190474872Z lang_stopwords = {get_alpha_2(l): l for l in stopwords._fileids}
2021-10-22T02:43:48.190478572Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/nltk/corpus/util.py", line 121, in __getattr__
2021-10-22T02:43:48.190482372Z self.__load()
2021-10-22T02:43:48.190485972Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/nltk/corpus/util.py", line 86, in __load
2021-10-22T02:43:48.190492772Z raise e
2021-10-22T02:43:48.190496472Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/nltk/corpus/util.py", line 81, in __load
2021-10-22T02:43:48.190500272Z root = nltk.data.find(f"{self.subdir}/{self.__name}")
2021-10-22T02:43:48.190503972Z File "/tmp/8d99502e5af8c42/antenv/lib/python3.7/site-packages/nltk/data.py", line 583, in find
2021-10-22T02:43:48.190507872Z raise LookupError(resource_not_found)
2021-10-22T02:43:48.190511372Z LookupError:
2021-10-22T02:43:48.190514972Z **********************************************************************
2021-10-22T02:43:48.190518572Z Resource stopwords not found.
2021-10-22T02:43:48.190522572Z Please use the NLTK Downloader to obtain the resource:
2021-10-22T02:43:48.190526273Z
2021-10-22T02:43:48.190529773Z >>> import nltk
2021-10-22T02:43:48.190533473Z >>> nltk.download('stopwords')
2021-10-22T02:43:48.190537173Z
2021-10-22T02:43:48.190540773Z For more information see: https://www.nltk.org/data.html
2021-10-22T02:43:48.190544373Z
2021-10-22T02:43:48.190547773Z Attempted to load corpora/stopwords
2021-10-22T02:43:48.190551473Z
2021-10-22T02:43:48.190554873Z Searched in:
2021-10-22T02:43:48.190558373Z - '/root/nltk_data'
2021-10-22T02:43:48.190561873Z - '/opt/python/3.7.9/nltk_data'
2021-10-22T02:43:48.190565473Z - '/opt/python/3.7.9/share/nltk_data'
2021-10-22T02:43:48.190568973Z - '/opt/python/3.7.9/lib/nltk_data'
2021-10-22T02:43:48.190572573Z - '/usr/share/nltk_data'
2021-10-22T02:43:48.190576073Z - '/usr/local/share/nltk_data'
2021-10-22T02:43:48.190579573Z - '/usr/lib/nltk_data'
2021-10-22T02:43:48.190583073Z - '/usr/local/lib/nltk_data'
2021-10-22T02:43:48.190586673Z **********************************************************************
2021-10-22T02:43:48.190590373Z
2021-10-22T02:43:48.206904802Z [2021-10-22 02:43:48 +0000] [39] [INFO] Worker exiting (pid: 39)
2021-10-22T02:43:48.768375965Z [2021-10-22 02:43:48 +0000] [36] [INFO] Shutting down: Master
2021-10-22T02:43:48.769032074Z [2021-10-22 02:43:48 +0000] [36] [INFO] Reason: Worker failed to boot.
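
Reading the traceback above, the lookup fails while `pke` is being imported (reached through `from Questgen import main` on line 3 of `app.py`), and it happens while the gunicorn worker boots, before any HTTP request is handled. The log also notes that data outside `/home` is not persisted. So one thing worth checking is whether the download runs before that import and into a directory NLTK actually searches. Below is a minimal sketch of that idea, not the code from my app: the helper name `ensure_nltk_resources` and the directory `/home/nltk_data` are assumptions made up for illustration, and the `nltk_module` parameter exists only so the helper can be exercised with a stub instead of the real library.

```python
import os

# Hypothetical location (an assumption, not from the original app). The log
# above notes that only data under /home persists on App Service.
NLTK_DATA_DIR = "/home/nltk_data"


def ensure_nltk_resources(resources, data_dir=NLTK_DATA_DIR, nltk_module=None):
    """Download each missing NLTK resource into data_dir.

    resources maps a lookup path (e.g. "corpora/stopwords") to the
    downloader package id (e.g. "stopwords").
    Returns the list of package ids that had to be downloaded.
    """
    if nltk_module is None:
        import nltk as nltk_module  # imported lazily; a stub can be passed instead

    os.makedirs(data_dir, exist_ok=True)
    # Put the directory at the front of NLTK's search path.
    if data_dir not in nltk_module.data.path:
        nltk_module.data.path.insert(0, data_dir)

    downloaded = []
    for lookup_path, package_id in resources.items():
        try:
            # Raises LookupError when the resource is not on the search path.
            nltk_module.data.find(lookup_path)
        except LookupError:
            nltk_module.download(package_id, download_dir=data_dir, quiet=True)
            downloaded.append(package_id)
    return downloaded


# Intended use at the very top of app.py, before the import that fails:
#
#     ensure_nltk_resources({"corpora/stopwords": "stopwords"})
#     from Questgen import main
```

Because the helper checks `nltk.data.find` first, a resource that is already present in the persisted directory is not downloaded again when the worker restarts.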