diff --git a/README.md b/README.md
index 7380eea0c3cc75bc3ad63736d323fe8a10908cba..37dd7e89c4186cf741877acc2a83ca53bcb2bf60 100644
--- a/README.md
+++ b/README.md
@@ -9,6 +9,7 @@ Send merge requests if you'd like to add your own!
 Current sub-directories are:
 * Email -- advice on how to send emails with Python now that basic password
   authentication does not work.
+* Microsoft -- scripting some Microsoft stuff like MS Forms
 * Moodle -- scripts for getting info or doings things with Moodle with less
   clicking (and more fighting with scripts).
 * LDAP -- scripts for querying LDAP (to get student module
diff --git a/microsoft/getting-forms-responses.md b/microsoft/getting-forms-responses.md
new file mode 100644
index 0000000000000000000000000000000000000000..e64d6a00d9a1316b9f2963da8530699b60d34acf
--- /dev/null
+++ b/microsoft/getting-forms-responses.md
@@ -0,0 +1,84 @@
+
+# Getting MS Forms Responses from Python
+
+If you have a form you need to check often, this saves you from repeatedly going to MS Forms to download the spreadsheet.
+
+## 1. Get Your Form ID
+
+You need your form ID, which you can find in the URL of the form as the "id" parameter. For example:
+
+    https://forms.office.com/Pages/DesignPageV2.aspx?...id=<id>...
+
+Let's call this `FORM_ID` from now on:
+
+    FORM_ID = "<id>"  # replace <id> with your own form ID
+
+## 2. Get Your Cookies
+
+If you have logged into MS Forms recently, your browser will hold cookies that you can reuse to authenticate. You can load these cookies in a Python script using the `browser_cookie3` library. I'll use `browser_cookie3.firefox()` to get Firefox cookies. You might want `browser_cookie3.chrome()` or one of the other browsers it supports.
+
+The important cookies are `AADAuth.forms`, `OIDCAuth.forms`, and `FormsWebSessionId` in the domain `forms.office.com` (except `AADAuth.forms` which has domain `.forms.office.com`…).
+
+To get the cookies you can use the code below.
+
+    import browser_cookie3
+
+    FORMS_DOMAINS = {
+        "forms.office.com",
+        ".forms.office.com", # oh microsoft
+    }
+    FORMS_COOKIES = {
+        "AADAuth.forms",
+        "FormsWebSessionId",
+        "OIDCAuth.forms",
+    }
+
+    cookie_jar = browser_cookie3.firefox()
+    cookies = {
+        c.name: c.value
+        for c in cookie_jar
+        if c.name in FORMS_COOKIES and c.domain in FORMS_DOMAINS
+    }
+
+Then check that all of the cookies were found:
+
+    if len(cookies) < len(FORMS_COOKIES):
+        raise Exception(
+            "Missing forms cookies, please make sure you are logged in."
+        )
+
+## 3. Getting the Responses
+
+We get the responses from a URL that includes the `FORM_ID`.
+
+    RESPONSES_URL = \
+        "https://forms.office.com/formapi/DownloadExcelFile.ashx?formid=" \
+        + FORM_ID
+
+To get the responses we'll send a GET request with the `requests` library. If the request succeeds (status code 200), the content of the response is an Excel spreadsheet.
+
+    import requests
+
+    resp = requests.get(RESPONSES_URL, cookies=cookies)
+    if resp.status_code != 200:
+        raise Exception(f"Getting responses failed (status {resp.status_code})!")
+
+## 4. Using the Responses
+
+To use the responses directly, you might read them into a Pandas dataframe (Pandas needs the `openpyxl` package to read `.xlsx` files). I used `na_filter=False` and `dtype=str` to stop Pandas from turning empty cells into NaN and values into numbers. You may prefer other options.
+
+    from io import BytesIO
+
+    import pandas as pd
+
+    df = pd.read_excel(BytesIO(resp.content), na_filter=False, dtype=str)
+
+You can iterate over rows as follows.
+
+    for row in df.to_dict(orient="records"):
+        print(row)
+
+Alternatively, you could save the spreadsheet directly to a file:
+
+    with open("my_form_responses.xlsx", "wb") as f:
+        f.write(resp.content)
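
The cookie check in step 2 only says that something is missing, not which cookie. Below is a minimal sketch of how the selection and check could be wrapped in two small functions that also name the missing cookies. The function names `select_forms_cookies` and `missing_forms_cookies` are hypothetical (they are not part of `browser_cookie3`), and the `fake_jar` objects are stand-ins so the example runs without a browser; a real jar would come from `browser_cookie3.firefox()` or similar.

```python
from types import SimpleNamespace

FORMS_DOMAINS = {"forms.office.com", ".forms.office.com"}
FORMS_COOKIES = {"AADAuth.forms", "FormsWebSessionId", "OIDCAuth.forms"}


def select_forms_cookies(cookie_jar):
    """Return {name: value} for the Forms cookies found in cookie_jar.

    Hypothetical helper: works on any iterable of objects that have
    .name, .domain and .value attributes, as cookie jar entries do.
    """
    return {
        c.name: c.value
        for c in cookie_jar
        if c.name in FORMS_COOKIES and c.domain in FORMS_DOMAINS
    }


def missing_forms_cookies(cookies):
    """Return the sorted names of required cookies absent from cookies."""
    return sorted(FORMS_COOKIES - set(cookies))


# Stand-in cookie objects for illustration only (assumed values).
fake_jar = [
    SimpleNamespace(name="AADAuth.forms", domain=".forms.office.com", value="a"),
    SimpleNamespace(name="FormsWebSessionId", domain="forms.office.com", value="b"),
    # Right name, wrong domain: this one is filtered out.
    SimpleNamespace(name="OIDCAuth.forms", domain="example.com", value="c"),
]

cookies = select_forms_cookies(fake_jar)
missing = missing_forms_cookies(cookies)
if missing:
    print("Missing Forms cookies: " + ", ".join(missing))
```

With a real cookie jar you would probably raise an exception rather than print, as in step 2, but the message can then tell you which cookie to look for in the browser.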