In short, the algorithm for links in page text is:
- select the content of pages, forum posts, or anything else (this can be done in several steps if you have a huge amount of data)
- parse it for links with a regular expression
- [optional] cache the list of links together with the pages' modification dates
- list the links in the admin tools window
- check that each link is alive with an HTTP request via curl
- show the results, with links to the source pages so they can be edited if necessary
- [optional] send the list by PM or save it
For DL links it is even simpler:
- select the DL links from pages
- check that each file exists (file_exists for local files, curl for external ones)
- show the results
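
A rough sketch of the DL check, again in Python and only as an illustration. The names download_exists, head_ok and the base_dir parameter are hypothetical. os.path.isfile plays the role of file_exists here (it is a bit stricter, since it does not accept directories), and urllib replaces curl for external files.

```python
import http.client
import os
import urllib.request
from urllib.parse import urlparse


def head_ok(url, timeout=10):
    """True if a HEAD request to url succeeds without an error status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except (OSError, ValueError, http.client.HTTPException):
        return False


def download_exists(link, base_dir=".", remote_check=head_ok):
    """Check one DL link: HTTP request for external files, disk lookup for local ones."""
    if urlparse(link).scheme in ("http", "https"):
        return remote_check(link)
    # Local file: resolve it relative to base_dir (assumed to be the site root).
    return os.path.isfile(os.path.join(base_dir, link))
```

The remote_check parameter is there only so the HTTP part can be swapped out, for example to try the check without touching the network.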