Hello readers!
Last week I was surprised when I got the following message on Microsoft Blogs (example: https://blogs.technet.microsoft.com/crypto):
After some investigation, I found more disabled blogs. I tried to find out what is going on, but without much luck. All I was able to find is that Microsoft is retiring its TechNet and MSDN blog platforms and moving to... yes, another blogging engine. Not all blogs are being moved, though. There are various rumors (nothing official yet) suggesting that only the most popular and trending (Azure!) blogs will be migrated and the rest will be wiped. Silently. Other rumors suggest that it is the blog owner’s responsibility to move a blog to the new platform. Keep in mind that these are just rumors; the fact is that blogs are silently disappearing: https://blogs.technet.microsoft.com/brandonlinton/2018/11/05/retirement/. There was no official announcement from Microsoft about the trend or a blog decommission schedule. Further investigation revealed that MSDN blogs are moving to DevBlogs and TechNet blogs are moving to TechCommunity.
Microsoft blogs were launched around mid-2003 on a customized version of Telligent Community Server. At the time, Community Server was the only available multi-user blogging engine powered by .NET. As new blogs were added and their popularity grew, it became evident that Community Server could not handle the growing load and that the platform’s functionality and scalability were limited.
In 2015, Microsoft decided to move all its blogs to a proven and scalable solution based on the WordPress blog engine.
At the end of 2018 (the process is still ongoing at the time of posting), Microsoft started a new blogging platform. There was no announcement about another blog migration. However, some blogs were migrated. Based on my brief research, the DevBlogs and TechCommunity platforms are powered by the Lithium blogging engine.
Microsoft blogs reached their prime in 2008-2013. After 2013, more and more blogs were abandoned by their owners and overall posting activity slowed significantly. As of March 2019, only a few dozen blogs are actively updated; the rest are no longer maintained (the owner may have changed positions, left Microsoft, or had some other reason).
It is a fact that Microsoft blogs are extremely valuable and popular within the IT Pro and software developer communities. The blogs contain literally a “shitload” of technical gems about Microsoft products, deeply detailed internals and other hidden knowledge you will never find anywhere else. A lot of interesting support case stories that product teams faced were posted as well. Blogs were used to announce and explain new products and features before more formal information was delivered to the TechNet and MSDN web sites.
Microsoft put a lot of effort into promoting the blogs and cultivating a community around them, and as a result the blogs are loved by the community. And now Microsoft silently, without any announcement, shuts down blogs and removes the content! I have no idea what criteria are used to schedule a particular blog’s shutdown. Some rumors suggest that blogs with low traffic are discontinued first, but that all inactive blogs will be discontinued eventually. That’s a pity! Even if a particular blog is no longer updated, its information is still relevant in most cases. IT-related websites are full of links to Microsoft blogs and their posts. Now these links are slowly dying and soon most of them will show you a 404. Without any explanation.
There is a chance to recover some links through the web archive, though not all blogs or posts were indexed by it. For example, the Windows 7 and Windows 8 development blogs in Russian weren’t backed up by the web archive. And if you don’t have the exact link, the web archive doesn’t help much. When the Windows Server 2003 TechNet content was retired, Microsoft released a compiled PDF version of it: Windows Server 2003/2003 R2 Retired Content. There is no such solution for blogs. No PDF, no other offline copy, nothing. Microsoft literally spat in the face of the community it cultivated with blogs. I can’t find other words to express my feelings when I hit a 404 on one blog or another that I’m trying to read.
I’m a huge fan of various old stuff and try to collect everything I find interesting. Had I known about the Microsoft blogs shutdown in advance, I would have reacted accordingly and backed up the blogs while they were alive. Fortunately, not all blogs have been wiped yet, so I quickly wrote a PowerShell solution that downloads an entire chosen blog with the content of every post. Later, I added image download (for images that are still available) and URL rewriting within posts (for URLs that appear in <a> and <img> HTML tags).
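To illustrate the URL rewriting idea, here is a minimal sketch of how an absolute blog URL could be mapped to a relative local path. The function name ConvertTo-LocalLink, the sample URLs and the ".html" file layout are all assumptions made for this example; they are not taken from Backup-MsftBlog.ps1, which may organize the downloaded files differently.

```powershell
# Hypothetical helper (not part of Backup-MsftBlog.ps1): maps an absolute
# URL under the blog root to a relative local path. The layout is an assumption.
function ConvertTo-LocalLink {
    param(
        [Parameter(Mandatory)][string]$Url,
        [Parameter(Mandatory)][string]$BlogUri
    )
    # Normalize the blog root so it always ends with exactly one slash.
    $base = $BlogUri.TrimEnd('/') + '/'
    # Links that point outside the blog are returned unchanged.
    if (-not $Url.StartsWith($base, [StringComparison]::OrdinalIgnoreCase)) {
        return $Url
    }
    $relative = $Url.Substring($base.Length).TrimEnd('/')
    # The blog root itself becomes the local index page.
    if ($relative -eq '') { return 'index.html' }
    # Images and attachments keep the extension they already have.
    if ([IO.Path]::HasExtension($relative)) { return $relative }
    # Post URLs have no extension, so save them as .html files.
    return "$relative.html"
}

# Made-up post URL, used only to demonstrate the mapping.
ConvertTo-LocalLink -Url 'https://blogs.technet.microsoft.com/pki/2019/03/01/sample-post/' `
                    -BlogUri 'https://blogs.technet.microsoft.com/pki/'
# 2019/03/01/sample-post.html
```

Each href and src value found in the <a> and <img> tags of a downloaded post could be passed through a function like this before the post is saved, so that links keep working offline.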
The logic is quite simple:
<a> tag) download it as well
The downloaded data structure is as follows:
Example HTML
Unfortunately, the PowerShell [xml] type accelerator doesn’t work with the blogs’ HTML (because of unescaped JS scripts). As a result, I was forced to use a third-party library, which made the HTML parsing job very easy.
The script depends on a third-party library called HtmlAgilityPack. You must download the dll file and put it in the same directory where the PowerShell script is located.
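The following sketch shows the kind of failure described above and how HtmlAgilityPack can be used instead. The Test-XmlParse function and the sample page are made up for this illustration and are not taken from the actual script. The HtmlAgilityPack part runs only if HtmlAgilityPack.dll is found in the current directory, which is an assumption of this example.

```powershell
# Hypothetical helper: returns $true when the text parses as XML, $false otherwise.
function Test-XmlParse {
    param([Parameter(Mandatory)][string]$Text)
    try { $null = [xml]$Text; return $true }
    catch { return $false }
}

# Made-up page fragment. The raw "<" and "&&" inside <script> are fine in HTML,
# but they are not well-formed XML, so the [xml] cast fails.
$page = '<html><body><script>if (a < b && c) { run(); }</script>' +
        '<a href="/pki/">PKI</a></body></html>'
Test-XmlParse -Text $page
# False

# Assumption: HtmlAgilityPack.dll sits in the current directory.
$dll = Join-Path $PWD.Path 'HtmlAgilityPack.dll'
if (Test-Path $dll) {
    Add-Type -Path $dll
    $doc = New-Object HtmlAgilityPack.HtmlDocument
    $doc.LoadHtml($page)
    # Print the href value of every link found in the page.
    foreach ($node in $doc.DocumentNode.SelectNodes('//a[@href]')) {
        $node.GetAttributeValue('href', '')
    }
}
```

If the dll was downloaded from the internet, it may have to be unblocked first, otherwise Add-Type can fail to load it.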
Vadim Sterkin published sample blogs in styled HTML and PDF formats. An online repository of already downloaded blogs, along with instructions, is stored on Google Drive.
Script download button:
The package includes the PowerShell script, a downloaded copy of HtmlAgilityPack and a stylesheet file with a custom theme.
And script example usage:
.\Backup-MsftBlog.ps1 -BlogUri https://blogs.technet.microsoft.com/pki/ -OutputDirectory .\blogs\pki
In the -BlogUri parameter you specify the full URL of the blog’s main page, and in the -OutputDirectory parameter you specify the folder where the blog will be downloaded. The script implements a -Verbose switch that prints a log of the crawling process. Use this script if you want your own copy of a Microsoft blog you like, since it can be wiped at any moment.
Hi Vadims,
Thanks for sharing Backup-MsftBlog.ps1. It is really helpful for backing up MSDN blogs.
I tried to back up the https://blogs.msdn.microsoft.com/kaushal blog, but it threw an error stating: "Invoke-WebRequest : This site uses cookies for analytics, personalized content and ads. By continuing to browse this site, you agree to this use. Learn more"
Please help.
You can try this workaround: https://stackoverflow.com/a/33678396/3997611
Great idea archiving the PKI site. Hope Microsoft realizes the magnitude of this shutdown.
I am getting the following error, even after running as admin and re-setting the execution policy:
"Add-Type : Could not load file or assembly 'file:///C:\users\v-something\desktop\Backup-MsftBlog\HtmlAgilityPack.dll' or
one of its dependencies. Operation is not supported. (Exception from HRESULT: 0x80131515)
At C:\users\v-something\desktop\Backup-MsftBlog\Backup-MsftBlog.ps1:25 char:1
+ Add-Type -Path HtmlAgilityPack.dll "
Any idea how to circumvent this?
@Randolph, make sure you have unblocked the .dll. If you download a file from the internet, an alternate data stream is added to indicate the file source. Right-click the dll and the script file, open Properties and press "Unblock" if such an option is visible.
Thank you! I had changed the Execution Policy for the script but had not unblocked the .dll.
That is exactly why it is better to keep your own work on your own domain and hosting; then no such changes can hurt you. The main thing is to make backups.
Hey,
This subreddit might be useful: https://www.reddit.com/r/sysadmin/comments/fmrywb/technet_gallery_backup_site/
I already handled this: https://twitter.com/Crypt32/status/1241414577524371456