path: root/README.md
author diogo464 <[email protected]> 2025-07-21 15:02:48 +0100
committer diogo464 <[email protected]> 2025-07-21 15:02:48 +0100
commit 8c8dabd0ed20679a2dad43a5c239f9fcfe1c1ad7 (patch)
tree 55abbcfbbff19efa3aaf6cf36540ac7651c54973 /README.md
init
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 40
1 files changed, 40 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 000000000..dc718ceeb
--- /dev/null
+++ b/README.md
@@ -0,0 +1,40 @@
# portugal-running-data
repo with a scraper for the Portugal Running (portugalrunning.com) calendar data.

| Filename | Source Script | Optional | Description |
| --- | --- | --- | --- |
| `lastmod` | `setup-directories` | no | last modification time extracted from the sitemap file |
| `page.html` | `fetch-page` | no | event page from portugalrunning.com |
| `id` | `extract-id` | no | numeric event id from WordPress |
| `data.json` | `fetch-data` | no | JSON file with some event data |
| `ics` | `fetch-ics` | no | calendar file with location, date and other event information |
| `location` | `fetch-location` | yes | location data for the event |
| `image` | `fetch-image` | yes | cover image for the event |
| `date` | `extract-date` | no | event date extracted from the ics file |
| `oneline-description` | `fetch-oneline-description` | yes | AI-generated one-line description |
| `categories` | `extract-categories` | no | event categories |
| `circuits` | `extract-circuits` | no | event circuits |

## `fetch-sitemap`
this script fetches the sitemap, which lists every event page url together with its last modification date.
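
As a rough illustration of what reading such a sitemap involves, the sketch below parses sitemap XML into `(url, lastmod)` pairs. It assumes the standard sitemaps.org format; `parse_sitemap`, `SAMPLE` and the example.com urls are hypothetical and not taken from the actual script.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return a list of (url, lastmod) pairs; lastmod is "" when absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        if loc:
            entries.append((loc, lastmod))
    return entries


# Hypothetical sample; the real sitemap contents are not shown in this README.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/event/some-race/</loc>
    <lastmod>2025-07-01T10:00:00+00:00</lastmod>
  </url>
  <url>
    <loc>https://example.com/event/other-race/</loc>
  </url>
</urlset>
"""

if __name__ == "__main__":
    for loc, lastmod in parse_sitemap(SAMPLE):
        print(loc, lastmod or "(no lastmod)")
```

Each pair could then be written to the per-event `lastmod` file listed in the table above.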

## `fetch-pages`
this script fetches any pages that are missing or outdated, using the `lastmod` file to decide.
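
One possible way to make the missing-or-outdated decision is sketched below. It is an assumption, not the script's confirmed logic: it supposes each event has its own directory, that `lastmod` holds an ISO 8601 timestamp, and that a page counts as outdated when that timestamp is newer than the modification time of `page.html`. The helper name `needs_fetch` is hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def needs_fetch(event_dir):
    """Return True when page.html is missing or older than the lastmod value."""
    event_dir = Path(event_dir)
    page = event_dir / "page.html"
    if not page.exists():
        return True  # missing page

    # Assumption: lastmod contains a single ISO 8601 timestamp.
    lastmod = datetime.fromisoformat((event_dir / "lastmod").read_text().strip())
    if lastmod.tzinfo is None:
        lastmod = lastmod.replace(tzinfo=timezone.utc)

    fetched_at = datetime.fromtimestamp(page.stat().st_mtime, tz=timezone.utc)
    return lastmod > fetched_at  # outdated page
```

Comparing against the file's modification time avoids storing a second timestamp, at the cost of breaking if `page.html` is touched by anything other than the fetch.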

## `extract-ids`
this script extracts the event id from each `page.html` file. the id is used later to fetch other data related to the event.
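
How the id is located in the page is not described here. A minimal sketch, assuming the page carries WordPress's usual `postid-<n>` class on its `<body>` tag (`extract_id` is a hypothetical helper, and the real script may read the id from somewhere else):

```python
import re
from typing import Optional

BODY_TAG_RE = re.compile(r"<body\b[^>]*>", re.IGNORECASE)
POSTID_RE = re.compile(r"\bpostid-(\d+)\b")


def extract_id(html: str) -> Optional[int]:
    """Return the numeric id from the body tag's postid-<n> class, if any."""
    body = BODY_TAG_RE.search(html)
    if body is None:
        return None
    match = POSTID_RE.search(body.group(0))
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Hypothetical page fragment.
    print(extract_id('<body class="single postid-12345 logged-out">'))
```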

## `fetch-ics`
this script uses the event id to fetch the event's ics file.
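
The table above notes that the `date` file is derived from the ics file. A minimal sketch of that step, assuming the date comes from the first `DTSTART` property inside a `VEVENT` block (`extract_date` and the sample calendar are hypothetical):

```python
import re
from datetime import date
from typing import Optional

# Matches DTSTART with optional parameters, e.g. DTSTART;TZID=Europe/Lisbon:20250921T090000
DTSTART_RE = re.compile(r"DTSTART(?:;[^:]*)?:(\d{4})(\d{2})(\d{2})")


def extract_date(ics_text: str) -> Optional[date]:
    """Return the date of the first DTSTART found inside a VEVENT, if any."""
    in_event = False
    for raw_line in ics_text.splitlines():
        line = raw_line.strip()
        if line == "BEGIN:VEVENT":
            in_event = True
        elif line == "END:VEVENT":
            in_event = False
        elif in_event:
            match = DTSTART_RE.match(line)
            if match:
                year, month, day = (int(part) for part in match.groups())
                return date(year, month, day)
    return None


# Hypothetical calendar; the VTIMEZONE block has its own DTSTART that must be skipped.
SAMPLE_ICS = """BEGIN:VCALENDAR
BEGIN:VTIMEZONE
BEGIN:STANDARD
DTSTART:19701025T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=Europe/Lisbon:20250921T090000
SUMMARY:Example race
END:VEVENT
END:VCALENDAR
"""
```

Restricting the search to `VEVENT` blocks matters because timezone definitions also contain `DTSTART` lines.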

## `fetch-data`
this script uses the event id to fetch some event data in json format.

## `fetch-images`
some events reference a main image in the json data file; this script fetches that image.

## `extract-organizer`
this script extracts the organizer from the class list in the json data file, if one exists.

## `extract-categories`
this script extracts a list of categories from the class list in the json data file.
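
The exact shape of the class list is not documented here, so the following is only a sketch under stated assumptions: the json data has a `class_list` field holding CSS class names, and categories are the classes starting with some known prefix. The key name, the `event_type-` and `event_organiser-` prefixes, and the helper `values_with_prefix` are all hypothetical.

```python
import json


def values_with_prefix(data, prefix, key="class_list"):
    """Return the sorted, de-duplicated class suffixes that follow `prefix`."""
    classes = data.get(key, [])
    if isinstance(classes, dict):
        # Tolerate a mapping of index to class name.
        classes = list(classes.values())
    found = {
        name[len(prefix):]
        for name in classes
        if isinstance(name, str) and name.startswith(prefix) and len(name) > len(prefix)
    }
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical data.json content.
    sample = json.loads(
        '{"class_list": ["post-1", "event_type-trail", "event_type-10k", "event_organiser-acme"]}'
    )
    print(values_with_prefix(sample, "event_type-"))
    print(values_with_prefix(sample, "event_organiser-"))
```

The same helper would cover the organizer case by passing a different prefix and taking the first result, if one exists.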