Integrity
Version History
Version 3.7.3
released November 2011
Links to subdomains can be treated as internal rather than external, ie peacockmedia.co.uk and www.peacockmedia.co.uk are considered the same site (which is not necessarily true, but is what most people would expect) and therefore both are followed. Adds a checkbox in global preferences to switch this option; the default is on. With the option on, Integrity will discover more links (and potentially more bad links) on certain websites. The option needs to be switched off if you wish to deliberately limit your crawl to one subdomain.
Fixes memory problem, helping application to deal with larger sites
Bug fix and small improvement to 'my sites' drawer
Closing main window quits application after 'are you sure' dialogue
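The subdomain option in this release can be illustrated with a minimal Python sketch. This is not Integrity's own code: the function name is_internal and its treat_subdomains_as_internal parameter are hypothetical, and the rule used (one host is a dotted suffix of the other) is only an assumption about how such a check might work.

```python
from urllib.parse import urlsplit


def is_internal(link: str, start: str,
                treat_subdomains_as_internal: bool = True) -> bool:
    """Decide whether `link` belongs to the same site as `start` (illustrative only)."""
    link_host = (urlsplit(link).hostname or "").lower()
    start_host = (urlsplit(start).hostname or "").lower()
    if not link_host or not start_host:
        return False
    if link_host == start_host:
        return True
    if not treat_subdomains_as_internal:
        # Option off: only an exact host match counts as internal.
        return False
    # Option on: accept the case where one host is a subdomain of the other,
    # eg www.peacockmedia.co.uk and peacockmedia.co.uk.
    return (link_host.endswith("." + start_host)
            or start_host.endswith("." + link_host))
```

With the option off, the same call returns False for www.peacockmedia.co.uk when the crawl started at peacockmedia.co.uk, which is how a crawl could be limited to one subdomain.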
Version 3.7.2
released October 2011
Exports .dot file (standard format used by graphing applications) which can be opened as a visualisation in third-party graphing apps. Includes colour to indicate levels. Accessed via File>Export or a new toolbar button added via 'Customize toolbar...'
Fixes problems with 'Re-check broken links' and 'Re-check this link'
Fixes 'on page as title / url' preference (broken in last version)
Adds 'Getting started' to the Help menu and splash screen
Replaces 'Bad links' icon with a more suitable one (previous one looks like 'delete')
Fixes glitch with 'Inspect selected' button when flat view is showing
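For readers unfamiliar with the .dot format mentioned in this release, the following Python sketch shows roughly what such an export could contain. It is an assumption-laden illustration, not Integrity's output: the function to_dot, the graph name "site" and the colour list are all hypothetical.

```python
def to_dot(edges, levels):
    """Build DOT text from (source, target) link pairs and a url -> level mapping."""
    # Hypothetical palette: one colour per level, the last one reused for deeper levels.
    colours = ["red", "orange", "gold", "green", "blue"]

    def quote(text):
        # DOT double-quoted strings need embedded quotes escaped.
        return '"' + text.replace('"', '\\"') + '"'

    lines = ["digraph site {"]
    for url, level in levels.items():
        colour = colours[min(level, len(colours) - 1)]
        lines.append(f"  {quote(url)} [color={colour}];")
    for source, target in edges:
        lines.append(f"  {quote(source)} -> {quote(target)};")
    lines.append("}")
    return "\n".join(lines)
```

The resulting text can be saved with a .dot extension and opened in a graphing application that reads the format.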
Version 3.7.1
released October 2011
Single version compatible with OSX 10.4 Tiger through to 10.7 Lion (minimum Intel / ppc 10.4)
(Since v3.6, an older version, v3.5, had been offered to Tiger users.)
Improvements to user interface: toolbar - customisation includes space and flexible space, contents of settings tab move to fill the space as main window is resized
Fixes problem of user not being able to get main window open again if closed
Fixes bug causing base href not to be discovered which could lead to many improperly-constructed relative urls
Fixes distance column in flat view
Version 3.7
released August 2011
OSX 10.7 Lion compatible
Improves 'My Sites' - allows the same url to be saved more than once with different settings.
Version 3.6
released May 2011
Ability to import list of links, either html format or plain text list
Online manual linked from Help menu, includes instructions for crawling sites locally and importing a list of links
Moves list of sites from drop-down list to 'my sites' pop-out drawer
'last checked' date and status is stored and displayed
Last used settings are saved and visible on launch
Minimum system requirements now Intel / 10.5
Version 3.5.4
released March 2011
Fixes bug relating to empty href's.
Improved reporting of link text and page titles which contain non-ascii characters.
Version 3.5.3
released January 2011
Efficiency improvements (using internal cache rather than copying data, object retention / release)
Version 3.5.2
released November 2010
New option to allow 'not followed' links to be excluded from sitemap.
Fixes bug preventing Integrity from recognising a link if it has a carriage return immediately after the a.
Version 3.5.1
released October 2010
German localisation added.
Fixes bug causing crashes if internet connection fails or isn't stable.
Allows copy of url from 'on page' column of link inspector (as per filenames, requires two single-clicks to select the url - note that a double-click opens the page and attempts to highlight the link on the page using a style set in Preferences).
Fixes bug causing crawl to stop if starting url is redirected.
Version 3.4.1
released October 2010
Fixes bug causing random crashes introduced with major changes in 3.4
Version 3.4
released October 2010
Better string handling for urls and link text - makes running more efficient and correctly displays link text which includes non-ascii (non-English) characters.
Reduced background status logging also makes for faster running.
Fixes bug preventing sorting of flat view with 'bad links only' showing.
Fixes bug preventing generation of flat view if 'bad links only' showing when crawl finishes.
Other small fixes.
Version 3.3.6a
released September 2010
Fixes bug which caused instability with certain sites when using more threads.
Version 3.3.6
released September 2010
Fixes bug causing random crashes, especially when losing internet connection
Adds option to highlight missing link urls (where href = "#" or "" )
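As an illustration of what a 'missing link url' check might look like, here is a small Python sketch using the standard library's html.parser. The class name MissingHrefFinder is hypothetical and the approach is an assumption, not a description of Integrity's parser.

```python
from html.parser import HTMLParser


class MissingHrefFinder(HTMLParser):
    """Collect <a> tags whose href is empty or just '#' (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.missing = []  # raw href values that count as 'missing'

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if "href" not in attributes:
            return
        # An href given without a value is reported by the parser as None.
        href = (attributes["href"] or "").strip()
        if href in ("", "#"):
            self.missing.append(href)
```

Usage: create a MissingHrefFinder, call feed() with the page source, then read the missing list.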
Version 3.3.5
released July 2010
Fixes bug preventing 'highlight link on page' feature working properly.
Fixes bug preventing crawling if comment terminated with more than two dashes eg '--->'
Fixes bug which prevented proper crawling if return or other characters were present inside </script> tag.
Version 3.3.4
released June 2010
Fixes bug which prevented proper crawling if return characters were present inside the <a> tag.
Version 3.3.3
released May 2010
Fixes bug which could cause crashing if using a custom user-agent string.
Context help added for some options.
Version 3.3.2
released May 2010
Minor improvements when checking sites on a local drive; improves adding 'file://' before crawling, and fixes bug preventing proper crawling.
Version 3.3.1
released April 2010
Adds setting - 'don't check external links' - makes crawl faster if you only need to generate a sitemap.
Version 3.3
released January 2010
Checks distance of each url from home page. Can be displayed as a column in Integrity's table views and exported files. See Preferences to switch this column on or off.
Generates XML sitemap. Note that the sitemap will be generated according to settings for the url crawled. (ie it is important to have settings like 'page titles are unique' or 'ignore querystrings' set correctly). Priority can be filled in automatically based on distance from home page.
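The idea of filling in sitemap priority from distance can be sketched in Python as below. The formula (1.0 at the home page, dropping by 0.2 per level, never below 0.1) and the names priority_for and build_sitemap are assumptions made for illustration; the source does not say how Integrity calculates priority.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def priority_for(distance: int) -> str:
    """Hypothetical rule: 1.0 for the home page, 0.2 less per level, floor of 0.1."""
    return f"{max(0.1, 1.0 - 0.2 * distance):.1f}"


def build_sitemap(distances: dict) -> str:
    """Build sitemap XML from a url -> distance-from-home-page mapping."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, distance in distances.items():
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "priority").text = priority_for(distance)
    return ET.tostring(urlset, encoding="unicode")
```

Because the mapping is keyed by url, settings that change which urls count as distinct (such as 'ignore querystrings') directly change what ends up in the sitemap.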
Version 3.2
released November 2009
Changes to the user interface. Current url is displayed in a combo box along with the 'go' button at the top of the main window. The settings for the current url (previously called 'current config') are now displayed in the default tab of the main window. Flat and sortable views are now switched using tab buttons at the bottom of the main window.
Option for checking broken images added. Image urls are denoted by [img src] in the link text column.
Bug fix - alt text is now correctly shown (if it exists) in the link text column when the link contains an image rather than text. For example: [linked image]:NHS Direct
Some improvements to saving / deleting of settings for current site.
Auto-complete added to main url combo box. However, this only works if you type the 'http' or 'www' or however the saved url starts.
Progress indicator added for 'recheck this link'. Response time and time stamp are also correctly updated.
Help and donate links updated.
Automatic checking for updates. Checks for updates on startup. If a new version is available, informs user and invites visit to download page.
Version 3.1.2
released October 2009
Explicitly doesn't handle cookies (random behaviour previously).
Will now pick up links within imagemap area tags.
Version 3.1.1
released March 2009
Fixes bug which stopped further crawling if initial page is redirected.
Small efficiency/speed improvement.
Fixes bug which could register incorrect links if a request is redirected more than once.
Version 3.1
released December 2008
Time stamp logged for each link checked
Views are now customisable - show or hide columns as you like. (Exported files reflect visible columns.)
"Redirected" no longer shows in status column as the information is available in its own column
New application icon with less transparency
Version 3.02
released December 2008
Fixes bug related to unquoted href's
Unique page titles option (was new with v3.0 - crawls site faster and more accurately if you set this option and if your page titles *are* unique) now defaults to off for existing configs; defaulting to on was causing confusion.
Version 3.01
released December 2008
Fixes bug preventing proper crawling of framesets
Fixes problem with pause/continue button
Fixes problem with About panel
Version 3
released December 2008
Adds 'Inspect Bad Links' to View menu (opens the first bad link in the link inspector)
Adds 'Next Bad Link' button to link inspector (moves the link inspector to the next bad link if there is one)
Adds two new tools to the toolbar for 'Inspect bad links' and 'Inspect selected link' and a 'Customise Toolbar...' menu item
Adds highlighting feature - double-click an 'On page' entry in the list in the link inspector and Integrity will open the selected page and highlight the selected link with a coloured background or coloured border.
Adds drop-down lists to preferences allowing you to choose the style of the highlighting (border / background, style and width of border)
Adds 'Archive pages while crawling' checkbox to preferences (archives pages while crawling - asks you for a save location when crawl is finished).
Version 2.2.2
released September 2008
If the link is around an image rather than text, the 'link text' columns will display [img]: and the alt text of the image.
'Redirected to' column added to flat view.
Changes to button bar, including addition of export as html, csv and text (tdl) buttons. Now properly autosaves user customisation.
More information in the status display - now also shows how many bad links have been found
Version 2.2.1
released September 2008
Was generating the 'flat view' multiple times, giving the impression of 'hanging' after crawling large sites using lots of threads. Bug fixed, and progress bar added.
Version 2.2
released July 2008
Server response time is logged. This is the time taken between Integrity sending the request and receiving the first response. This may not reflect the actual server response time if Integrity is running a large number of threads, or if the internet connection is busy
When Integrity has finished running, a 'flat' view is available, that can be sorted by any of the columns
Global preferences and current config are now combined into one tabbed window
Standard customisable toolbar added and main window rearranged. Stop is now renamed 'Pause'
Version 2.1
released June 2008
Crawls local files (drag the file into the 'starting URL' box)
Version 2.0 (beta)
Architecture / Logic changed. This fixes thread-safety issues (ie v1.x crashing on faster machines when using larger number of threads). Architecture change also makes v2 faster.
Now handles sites built using frames.
Max number of threads increased. This was limited in version 1.6.6 as a quick-fix to thread-safety issues. Max number of threads (when slider is in 'more' position) is now 29, was 7.
'Threads' are no longer really separate threads owned by Integrity, but simultaneous asynchronous requests.
Version 1.6.11
released May 2008
Fixes bug which was causing some links to be skipped on certain pages. Integrity's parser was getting confused sometimes by javascript on pages containing 'less than' and 'greater than' operators.
Other small fixes and efficiencies.
Version 1.6.10
released April 2008
Progress indicators added to export functions.
Link info window now shows all occurrences of a link alongside the link text for each occurrence.
Version 1.6.9
released April 2008
Fixes bug related to trimming which randomly prevented complete crawling of whole site.
Revised handling of incorrectly nested quotes - now correctly allows for apostrophes as part of url ( "/pdf/Educators'_Guide" ).
Help menu now links to support pages of peacockmedia.co.uk, 'Donate' menu option added.
Version 1.6.8
released April 2008
Routines for trimming whitespace, querystring etc rewritten in pure C, improving efficiency.
Better handling of incorrectly nested single/double quotes ( href = "http://..' )
Now correctly handles base href's which don't give a scheme (assumes http://)
Better trimming of whitespace, ie carriage returns and other control characters in unexpected places in the middle of <a ..> tags
Shows how many times a link occurs, not just how many pages it appears on (ie it may appear multiple times on same page).
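The base href behaviour listed for this version (assume http:// when no scheme is given) can be sketched in Python as follows. The function name resolve is hypothetical, and using urljoin for the relative-url step is an assumption; the original routines were written in C and are not shown in this history.

```python
from urllib.parse import urljoin


def resolve(href: str, base_href: str) -> str:
    """Resolve a relative href against a base href, assuming http:// if no scheme."""
    if "://" not in base_href:
        # Covers both 'www.example.com/docs/' and '//www.example.com/docs/'.
        base_href = "http://" + base_href.lstrip("/")
    return urljoin(base_href, href)
```

For example, resolve("page.html", "www.example.com/docs/") gives "http://www.example.com/docs/page.html".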
Version 1.6.7
released March 2008
Fixes bug which prevented links being found on a page if the end of a comment and an 'end script' tag were adjacent to each other ( --></script> )
Version 1.6.6
released March 2008
Sends user-agent string in header - default is "integrity/1.6" but this can be changed (see Preferences) if your site needs integrity to appear to be a recognised browser.
Other fixes and efficiencies.
Version 1.6.5
released November 2007
'whitelists' and 'blacklists' from the config are no longer case-sensitive.
Some problems with mcms zref fixed. zrefs are now shown when good links are hidden.
Links which are not checked because they are in the blacklist are treated as good links. They are hidden when good links are hidden and are given no colour label.
"Hide good links" button has now become "Show bad links only". This subtle change means that links which have not been checked will not show and improves running.
Small fixes and efficiencies.
Version 1.6.4
released October 2007
Fixes problem with tab-delimited file export
Both tab-delimited and comma-separated exports are 'flat', ie each 'on page url' has its own row
Fixes crashes or problems caused by carriage returns or whitespace present within a quoted href (yes, some html has really unexpected features)
Ignores Javascript (anything between <script> tags)
More object retention fixes and small efficiencies
Version 1.6.2
released September 2007
'on page url' will now recognise 'http://peacockmedia.co.uk' and 'http://peacockmedia.co.uk/' as the same link. Therefore a broken link may more correctly be reported on a lower number of pages and the whole application is a little more efficient.
Recognises and reports 'zref' links, a difficult-to-find link inserted by Microsoft Content Management Server
Other small efficiencies and fixes.
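Treating a bare host and the same host with a trailing slash as one link could look something like this in Python. It is only a sketch under the assumption that an empty path is rewritten to '/'; normalise_root is a hypothetical name and other kinds of normalisation are deliberately left out.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_root(url: str) -> str:
    """Give a url with a host but no path the path '/', so both forms compare equal."""
    parts = urlsplit(url)
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)
```

Keying the 'on page' records by the normalised form would mean both spellings are counted as the same page.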
Version 1.6.1
released July 2007
Some changes to improve stability
Version 1.6
released July 2007
Adds user-definable colour labels (see Preferences). A 'good link' is defined as server response code 2xx, redirected links include any 3xx code, a bad link is a 4xx code, and an 'error' is a 5xx server code or any other error.
Menu item added View > Info for Current Item (command-I), shows link inspector palette (previously only available via double-click in the main table).
Fixes bug causing crash if no internet connection.
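The four categories defined for the colour labels in this version map onto status codes as in the Python sketch below. The function name classify and the use of None to stand for 'no response at all' are assumptions made for the example.

```python
from typing import Optional


def classify(status: Optional[int]) -> str:
    """Map a server response code to the category names used in this version."""
    if status is None:
        # Hypothetical convention: None means the request failed with no response.
        return "error"
    if 200 <= status < 300:
        return "good"
    if 300 <= status < 400:
        return "redirected"
    if 400 <= status < 500:
        return "bad"
    # 5xx codes and anything else fall through to 'error'.
    return "error"
```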
Version 1.5
released 28 May 2007
Supports base href.
Can now export tab-delimited text file along with CSV, plain text and HTML.
Improved HTML export - link urls are presented as links.
Adds 'Only follow links containing...' field.
Fixes bug allowing some 'commented out' urls to be tested.
Fixes bug preventing inspector window opening when some links double-clicked.
Preferences window added: allows choice of displaying 'on page' as url or page title.
Config Starting URL drop-down list behaviour improved.
Version 1.4.2
released May 21 2007
No longer parses and extracts links from error pages (eg 404 pages).
Now handles spaces in URLs (as long as correctly contained in single or double quotes).
Version 1.4
released April 22 2007
Fixes a problem in some earlier versions which prevented all links being found on some pages
HTML character entities in links are now 'un-encoded' (eg '&amp;' is replaced with '&') before link is checked.
If link appears on more than one page, main table now shows actual number of pages rather than "multiple"
'Re-Check Bad Links' feature added (under File menu)
Fixes problem with export to CSV for some sites.
NB. early copies of 1.4 give the version number as 1.3.1 in about box.
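'Un-encoding' character entities in a link before it is checked can be shown with Python's standard html.unescape. This is an illustration of the idea only; decode_href is a hypothetical name and nothing here reflects how Integrity itself does it.

```python
from html import unescape


def decode_href(raw_href: str) -> str:
    """Replace HTML character entities in an href with the characters they stand for."""
    return unescape(raw_href)
```

For example, decode_href("/page?a=1&amp;b=2") returns "/page?a=1&b=2", which is the url a browser would actually request.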
Version 1.3.1
released April 7 2007
Fixes problem with the 'don't check URLs containing' feature which didn't work properly in v1.3
Fixes problem which caused some links to be missed
Small improvement to the stop button
Version 1.3
released April 6 2007
'This page only' checkbox added.
Status display more accurately shows number of links done.
Programme flow, thread safety and object retention improvements. Cures an instability which seemed to be related to websites which have large collections of external links and/or setting a larger number of threads.
Fixes bug preventing some link text from being recorded properly.
For some file types which may be larger files (pdf, mpg, mp3, jpg) the parser no longer sends an http request to check the 'Content-Type', speeding up the crawl time.
Version 1.2
released March 29 2007
Now tolerant to excessively long hrefs (previously hrefs over 1000 characters would break an internal limit and cause the application to crash).
Timeout can now be set in the config window. Using a very large number of threads can obviously make timeouts more likely and so the timeout figure can now be increased accordingly.
The link inspector window (double-click an entry in the main table) now shows the 'on page' list in a form which is clickable. A double-click will open the page in question.
The HTML report now shows the 'on page' column as links to the page in question.
Version 1.1
released March 26 2007
Link text shows up for more links - link text is still only held once regardless of how many instances of that link are found on the site, but if a link has no text (eg image link), then that will not overwrite the existing link text.
Ignores javascript links as well as mailto links.
Fixes bug triggered by a return within the tag.
Fixes bug which could prevent all links being found on certain pages.
Version 1.0
released March 25 2007
First non-beta release, free and not set to expire. Not generally released, but provided to 2 magazine coverdiscs.
Version 0.5 (Beta)
released March 22 2007
Corrected problem which allowed cached data to be checked - new data is now requested every time.
Fixes bug which could prevent some links being found if javascript present in page.
Version 0.4 (Beta)
released March 21 2007
Bug fixed which prevented some relative URLs from being formed correctly
Displays better information about any redirected urls. The final status code shown is the status for the final (redirected to) URL
Link text included as column in main table
Change to programme flow and a number of small refinements and efficiency improvements meaning that the application remains responsive throughout larger crawls.
Bug fixed which prevented some configs saving properly
Version 0.3 (Beta)
released March 7 2007
Improved interface, added 'Continue' button, allows Integrity to be paused and re-started.
Exporting - results can be exported as HTML, CSV or plain text.
Version 0.2 (Beta)
released March 1 2007
Fixes bug preventing Integrity from following links where html is all uppercase.
Version 0.1 (Beta)
released Feb 2007
