NASA Enterprise Directory

Names and contact information of NASA employees and contractors. 102,615 entries, each with name, email, and phone fields.

Data and Resources

Additional Info

Field Value
Source https://people.nasa.gov
Version
Author
Author Email
Maintainer
Maintainer Email
Shared (this field will be removed in the future) Open
IB1 Sensitivity Class
IB1 Trust Framework
IB1 Dataset Assurance
Free text description of capture process Python: Selenium and PhantomJS for the scrape, LXML for the parse. I ran an exhaustive series of searches by constructing URLs. I began by searching the email field for every valid two-character combination followed by the wildcard '*'. If a search returned too many results to display on one page (more than 100), I appended each possible additional character in the next round, and so on. The process ended when no search returned too many results to display on a single page. To find directory listings without email addresses, I repeated the process for last names, first names, and phone numbers. If a field included more than 100 identical entries, I constructed additional search loops on a case-by-case basis, all of which are included in the attached scripts. Because pages were rendered using JavaScript, I used a headless browser (PhantomJS driven by Selenium) in Python to convert pages to static HTML. I parsed the resulting HTML files using LXML in Python, then wrote all data to a comma-delimited CSV file using the unicodecsv package.
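
The prefix-expansion search described above can be sketched roughly as follows. This is a minimal illustration, not the attached scripts: the names `expand_prefixes`, `make_counter`, `max_len`, and the toy alphabet are hypothetical, and the `count_matches` callable stands in for the real step of loading a constructed search URL in the headless browser and counting the results. It also assumes that a query such as `ab*` matches every value beginning with `ab`.

```python
from itertools import product

PAGE_LIMIT = 100  # per the description: more than 100 results do not fit on one page


def expand_prefixes(count_matches, alphabet, start_len=2, max_len=6,
                    page_limit=PAGE_LIMIT):
    """Return (done, unresolved) prefix lists.

    done:       prefixes whose wildcard search fits on a single page
    unresolved: prefixes still over the limit at max_len (for example,
                more than 100 identical entries), left for manual handling
    """
    # Round one: every combination of start_len characters.
    frontier = ["".join(chars) for chars in product(alphabet, repeat=start_len)]
    done, unresolved = [], []
    while frontier:
        next_round = []
        for prefix in frontier:
            n = count_matches(prefix + "*")
            if n == 0:
                continue  # nothing to collect for this prefix
            if n <= page_limit:
                done.append(prefix)  # results fit on one page
            elif len(prefix) < max_len:
                # Too many results: append each possible next character.
                next_round.extend(prefix + ch for ch in alphabet)
            else:
                unresolved.append(prefix)
        frontier = next_round
    return done, unresolved


def make_counter(values):
    """Hypothetical stand-in for the live search: counts prefix matches."""
    def count_matches(query):
        prefix = query.rstrip("*")
        return sum(v.startswith(prefix) for v in values)
    return count_matches


# Toy data: 150 values under "ab" (too many for one page) and 2 under "cd".
values = [f"ab{c}{i}" for c in "abc" for i in range(50)] + ["cd1", "cd2"]
done, unresolved = expand_prefixes(make_counter(values), alphabet="abcd")
```

In this toy run the `ab` prefix exceeds the page limit and is split into three-character prefixes, while `cd` is collected directly. The `max_len` cap is an added safeguard so that a value repeated more than 100 times cannot make the loop run forever; the description handles those cases with separate, case-by-case search loops.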