
If you've maintained a website for any length of time, you'll know how quickly links become broken. We all move, delete or change pages, and when we do, it not only breaks our own internal links but also breaks other people's links to our website. Similarly, when other people alter their pages, our own external links break. A broken link on your site is a dead end for your visitors and is also bad news for your search engine optimisation (SEO).

Unless you enjoy clicking every single link on your site followed by the back button, you'll need to use a website crawler like Integrity! Feed it your home page address (URL) and Integrity will follow all of your internal links to find your pages, checking the server response code for all internal and external links found.

Integrity is donationware, which means that it's available to personal users free of charge with no restrictions. I'm very grateful for donations and if you choose to donate, it will encourage further development of this and other OSX software.
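As a rough illustration of the crawl described above (follow internal links to discover pages, check the response code of every link, internal or external), here is a minimal Python sketch. It is not Integrity's actual code. The `crawl` function, the `fetch` callable and the in-memory `PAGES` table are all hypothetical stand-ins so that the example runs without a network connection.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, fetch):
    """Check every link reachable from start_url.

    fetch(url) is a hypothetical callable returning (status_code, html_text).
    Internal pages are parsed for further links; external links are
    checked but never followed.
    """
    host = urlparse(start_url).netloc
    results = {}  # url -> status code
    queue = [start_url]
    while queue:
        url = queue.pop(0)
        if url in results:
            continue  # already checked
        status, body = fetch(url)
        results[url] = status
        # Only parse pages that are internal and loaded successfully.
        if urlparse(url).netloc != host or status != 200:
            continue
        parser = LinkExtractor()
        parser.feed(body)
        for href in parser.hrefs:
            queue.append(urljoin(url, href))  # resolve relative links
    return results


# A tiny in-memory "site" standing in for real HTTP requests (assumption).
PAGES = {
    "http://example.com/": (
        200,
        '<a href="/about">About</a> <a href="http://other.example/x">X</a>',
    ),
    "http://example.com/about": (200, '<a href="/missing">Gone</a>'),
    "http://other.example/x": (200, '<a href="/never-followed">n</a>'),
}


def fake_fetch(url):
    # Anything not in the table behaves like a missing page.
    return PAGES.get(url, (404, ""))


report = crawl("http://example.com/", fake_fetch)
for url, status in report.items():
    print(status, url)
```

In this sketch the external page is checked once but its own links are ignored, which keeps the crawl confined to the starting site.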

System Requirements
Mac OSX 10.3 or higher. (Note that as from v1.4, 10.2 is no longer supported.)

Mac OSX Download
Download Integrity v3

Integrity v3 is here
A collection of new features makes it easier to step through your bad links and find them on all of the pages on which they appear. See the version history below for full details.

PC Version?
If you're of the Windows persuasion, use Xenu's Link Sleuth. It's the best link checker that I've found, but the developer has made it clear that he's not interested in producing a specific Mac version. I've no connection with Tilman Hausherr (though he seems like a great guy), and this is no more than a personal recommendation to use the Link Sleuth if you're a PC user. Integrity isn't intended to be a Mac version of Xenu's Link Sleuth, but was inspired by it.
Version History

Version 3.1.1 released March 2009
- Fixes bug which stopped further crawling if initial page is redirected.
- Small efficiency/speed improvement.
- Fixes bug which could register incorrect links if a request is redirected more than once.

Version 3.1 released December 2008
- Time stamp logged for each link checked.
- Views are now customisable - show or hide columns as you like. (Exported files reflect visible columns.)
- "Redirected" no longer shows in status column as the information is available in its own column.
- New application icon with less transparency.

Version 3.02 released December 2008
- Fixes bug related to unquoted hrefs.
- Unique page titles option (was new with v3.0 - crawls site faster and more accurately if you set this option and if your page titles *are* unique) now defaults to off for existing configs; defaulting to on was causing confusion.
Version 3.01 released December 2008
- Fixes bug preventing proper crawling of framesets.
- Fixes problem with pause/continue button.
- Fixes problem with About panel.

Version 3 released December 2008
- Adds 'Inspect Bad Links' to View menu (opens the first bad link in the link inspector).
- Adds 'Next Bad Link' button to link inspector (moves the link inspector to the next bad link if there is one).
- Adds two new tools to the toolbar for 'Inspect bad links' and 'Inspect selected link' and a 'Customise Toolbar...' menu item.
- Adds highlighting feature - double-click an 'On page' from the list in the link inspector and Integrity will open the selected page and highlight the selected link with coloured background or coloured border.
- Adds drop-down lists to preferences allowing you to choose the style of the highlighting (border / background, style and width of border).
- Adds 'Archive pages while crawling' checkbox to preferences (archives pages while crawling - asks you for a save location when crawl is finished).

Version 2.2.2 released September 2008
- If the link is around an image rather than text, the 'link text' columns will display [img]: and the alt text of the image.
- 'Redirected to' column added to flat view.
- Changes to button bar including addition of export as html, csv and text (tdl) buttons.
- Now properly autosaves user customisation.
- More information in the status display - now also shows how many bad links have been found.

Version 2.2.1 released September 2008
- Was generating the 'flat view' multiple times, giving the impression of 'hanging' after crawling large sites using lots of threads. Bug fixed, and progress bar added.

Version 2.2 released July 2008
- Server response time is logged. This is the time taken between Integrity sending the request and receiving the first response. This may not reflect the actual server response time if Integrity is running a large number of threads, or if the internet connection is busy.
- When Integrity has finished running, a 'flat' view is available, that can be sorted by any of the columns.
- Global preferences and current config are now combined into one tabbed window.
- Standard customisable toolbar added and main window rearranged.
- Stop is now renamed 'Pause'.

Version 2.1 released June 2008
- Crawls local files (drag the file into the 'starting URL' box).

Version 2.0 (beta)
- Architecture / Logic changed. This fixes thread-safety issues (ie v1.x crashing on faster machines when using larger number of threads). Architecture change also makes v2 faster.
- Now handles sites built using frames.
- Max number of threads increased. This was limited in version 1.6.6 as a quick-fix to thread-safety issues. Max number of threads (when slider is in 'more' position) is now 29, was 7.
- 'Threads' are no longer really separate threads owned by Integrity, but simultaneous asynchronous requests.

Version 1.6.11 released May 2008
- Fixes bug which was causing some links to be skipped on certain pages. Integrity's parser was getting confused sometimes by javascript on pages containing 'less than' and 'greater than' operators.
- Other small fixes and efficiencies.

Version 1.6.10 released April 2008
- Progress indicators added to export functions.
- Link info window now shows all occurrences of a link alongside the link text for each occurrence.

Version 1.6.9 released April 2008
- Fixes bug related to trimming which randomly prevented complete crawling of whole site.
- Revised handling of incorrectly nested quotes - now correctly allows for apostrophes as part of url ( "/pdf/Educators'_Guide" ).
- Help menu now links to support pages of peacockmedia.co.uk, 'Donate' menu option added.

Version 1.6.8 released April 2008
- Routines for trimming whitespace, querystring etc rewritten in pure C, improving efficiency.
- Better handling of incorrectly nested single/double quotes ( href = "http://..' ).
- Now correctly handles base hrefs which don't give a scheme (assumes http://).
- Better trimming of whitespace, ie carriage returns and other control characters in unexpected places in the middle of <a ..> tags.
- Shows how many times a link occurs, not just how many pages it appears on (ie it may appear multiple times on same page).

Version 1.6.7 released March 2008
- Fixes bug which prevented links being found on a page if the end of a comment and an 'end script' tag were adjacent to each other ( --></script> ).

Version 1.6.6 released March 2008
- Sends user-agent string in header - default is "integrity/1.6" but this can be changed (see Preferences) if your site needs Integrity to appear to be a recognised browser.
- Other fixes and efficiencies.

Version 1.6.5 released November 2007
- 'whitelists' and 'blacklists' from the config are no longer case-sensitive.
- Some problems with mcms zref fixed.
- zrefs are now shown when good links are hidden.
- Links which are not checked because they are in the blacklist are treated as good links. They are hidden when good links are hidden and are given no colour label.
- "Hide good links" button has now become "Show bad links only". This subtle change means that links which have not been checked will not show and improves running.
- Small fixes and efficiencies.

Version 1.6.4 released October 2007
- Fixes problem with tab-delimited file export.
- Both tab-delimited and comma-separated exports are 'flat', ie each 'on page url' has its own row.
- Fixes crashes or problems caused by carriage returns or whitespace present within a quoted href (yes, some html has really unexpected features).
- Ignores Javascript (anything between <script> tags).
- More object retention fixes and small efficiencies.

Version 1.6.2 released September 2007
- 'on page url' will now recognise 'http://peacockmedia.co.uk' and 'http://peacockmedia.co.uk/' as the same link. Therefore a broken link may more correctly be reported on a lower number of pages and the whole application is a little more efficient.
- Recognises and reports 'zref' links, a difficult-to-find link inserted by Microsoft Content Management Server.
- Other small efficiencies and fixes.

Version 1.6.1 released July 2007
- Some changes to improve stability.

Version 1.6 released July 2007
- Adds user-definable colour labels (see Preferences). A 'good link' is defined as server response code 2xx, redirected links include any 3xx code, a bad link is a 4xx code, and an 'error' is a 5xx server code or any other error.
- Menu item added View > Info for Current Item (command-I), shows link inspector palette (previously only available via double-click in the main table).
- Fixes bug causing crash if no internet connection.

Version 1.5 released 28 May 2007
- Supports base href.
- Can now export tab-delimited text file along with CSV, plain text and HTML.
- Improved HTML export - link urls are presented as links.
- Adds 'Only follow links containing...' field.
- Fixes bug allowing some 'commented out' urls to be tested.
- Fixes bug preventing inspector window opening when some links double-clicked.
- Preferences window added: allows choice of displaying 'on page' as url or page title.
- Config Starting URL drop-down list behaviour improved.

Version 1.4.2 released May 21 2007
- No longer parses and extracts links from error pages (eg 404 pages).
- Now handles spaces in URLs (as long as correctly contained in single or double quotes).

Version 1.4 released April 22 2007
- Fixes a problem in some earlier versions which prevented all links being found on some pages.
- HTML character entities in links are now 'un-encoded' (eg '&amp;' is replaced with '&') before link is checked.
- If link appears on more than one page, main table now shows actual number of pages rather than "multiple".
- 'Re-Check Bad Links' feature added (under File menu).
- Fixes problem with export to CSV for some sites.
- NB. early copies of 1.4 give the version number as 1.3.1 in about box.

Version 1.3.1 released April 7 2007
- Fixes problem with the 'don't check URLs containing' feature which didn't work properly in v1.3.
- Fixes problem which caused some links to be missed.
- Small improvement to the stop button.

Version 1.3 released April 6 2007
- 'This page only' checkbox added.
- Status display more accurately shows number of links done.
- Programme flow, thread safety and object retention improvements. Cures an instability which seemed to be related to websites which have large collections of external links and/or setting a larger number of threads.
- Fixes bug preventing some link text from being recorded properly.
- For some file types which may be larger files (pdf, mpg, mp3, jpg) the parser no longer sends an http request to check the 'Content-Type', speeding up the crawl time.

Version 1.2 released March 29 2007
- Now tolerant to excessively long hrefs (previously hrefs over 1000 characters would break an internal limit and cause the application to crash).
- Timeout can now be set in the config window. Using a very large number of threads can obviously make timeouts more likely and so the timeout figure can now be increased accordingly.
- The link inspector window (double-click an entry in the main table) now shows the 'on page' list in a form which is clickable. A double-click will open the page in question.
- The HTML report now shows the 'on page' column as links to the page in question.

Version 1.1 released March 26 2007
- Link text shows up for more links - link text is still only held once regardless of how many instances of that link are found on the site, but if a link has no text (eg image link), then that will not overwrite the existing link text.
- Ignores javascript links as well as mailto links.
- Fixes bug triggered by a return within the tag.
- Fixes bug which could prevent all links being found on certain pages.

Version 1.0 released March 25 2007
- First non-beta release, free and not set to expire. Not generally released, but provided to 2 magazine coverdiscs.

Version 0.5 (Beta) released March 22 2007
- Corrected problem which allowed cached data to be checked - new data is now requested every time.
- Fixes bug which could prevent some links being found if javascript present in page.

Version 0.4 (Beta) released March 21 2007
- Bug fixed which prevented some relative URLs from being formed correctly.
- Displays better information about any redirected urls. The final status code shown is the status for the final (redirected to) URL.
- Link text included as column in main table.
- Change to programme flow and a number of small refinements and efficiency improvements meaning that the application remains responsive throughout larger crawls.
- Bug fixed which prevented some configs saving properly.

Version 0.3 (Beta) released March 7 2007
- Improved interface, added 'Continue' button, allows Integrity to be paused and re-started.
- Exporting - results can be exported as HTML, CSV or plain text.

Version 0.2 (Beta) released March 1 2007
- Fixes bug preventing Integrity from following links where html is all uppercase.

Version 0.1 (Beta) released Feb 2007
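
Two rules from the history above are easy to state in code: the v1.6 definition of good, redirected, bad and error links by response code, and the v1.6.2 change that treats a URL with and without a trailing slash after the host as the same link. The following Python sketch shows one way to express them. It is an illustration only, not Integrity's source, and the function names `classify` and `normalise` are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def classify(status):
    """Map a server response code to the link categories listed under v1.6.

    status is an int, or None when no response was received at all
    (treating None as 'error' is an assumption of this sketch).
    """
    if status is None:
        return "error"
    if 200 <= status < 300:
        return "good"
    if 300 <= status < 400:
        return "redirected"
    if 400 <= status < 500:
        return "bad"
    return "error"  # 5xx, or any other unexpected code


def normalise(url):
    """Give a URL with an empty path the path '/', so that
    'http://host' and 'http://host/' compare equal."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(classify(404))
print(normalise("http://peacockmedia.co.uk") == normalise("http://peacockmedia.co.uk/"))
```

Normalising before counting means a broken link is recorded once per page rather than once per spelling of that page's address.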