As always in IT projects, the devil is in the details. Looking more closely at the results of the first migration, I found some issues that had to be corrected. The curious LaTeX phrases I mentioned in the last blog post were the biggest pain to solve. The main culprits were unwanted character set changes during the whole process (reading the XML, processing the text in PHP, transferring it to the database, rendering the HTML pages in Q2A) and the handling of HTML tags. However, we now have a local site with all posts and votes of SE.TP migrated to Q2A (still using the dummy user hack), as shown in the following example:
Note that two subcategories, SE.TP and SE.TP.Meta, have been created. For an example of how attribution could look (at least for SE.TP, but not yet for SE.Physics), see the text in the right side panel. LaTeX now looks fine in all posts. Meta has now been included as well, as shown in the post from Shog9 with its far-reaching consequences (the SE.TP site was closed a short time later):
I’d like to discuss some points on the continuation of this project:
We now have two subcategories, SE.TP and SE.TP.Meta, from the closed SE beta page on Area 51 Stack Exchange. A simple way to handle attribution for these posts would be a text in the sidebar, as in the sample above, combined with preventing users from adding new posts to these two categories. Binding attribution to individual posts would require a core hack in the Q2A code, which would take a lot of time and carry risk. I do not yet know how imported posts from the live SE.Physics site could be handled.
The SE.TP dump contains an additional file called history.xml. It contains the change history, with the following type IDs:
- Edit Title – A question’s title has been changed.
- Edit Body – A post’s body has been changed, the raw text is stored here as markdown.
- Edit Tags – A question’s tags have been changed.
- Rollback Title – A question’s title has reverted to a previous version.
- Rollback Body – A post’s body has reverted to a previous version – the raw text is stored here.
- Rollback Tags – A question’s tags have reverted to a previous version.
- Post Closed – A post was voted to be closed.
- Post Reopened – A post was voted to be reopened.
- Post Deleted – A post was voted to be removed.
- Post Undeleted – A post was voted to be restored.
- Post Locked – A post was locked by a moderator.
- Post Unlocked – A post was unlocked by a moderator.
- Community Owned – A post has become community owned.
- Post Migrated – A post was migrated.
- Question Merged – A question has had another, deleted question merged into itself.
- Question Protected – A question was protected by a moderator.
- Question Unprotected – A question was unprotected by a moderator.
- Post Disassociated – An admin removes the OwnerUserId from a post.
- Question Unmerged – A previously merged question has had its answers and votes restored.
In my opinion, it could be very complicated to write program logic that applies all these changes to the current site. What is your opinion: is that really required? Implementing all these types may delay the start of the new site for a long time. To give you an idea: the simple migration for the current local site alone required more than 600 lines of code. Does anybody know whether the posts in the SE dump reflect the state before or after these changes?
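Before deciding, it may help to count how often each history type actually occurs in the dump. The following is a minimal sketch in Python, not part of the PHP migration code. It assumes that history.xml consists of `<row>` elements carrying a `PostHistoryTypeId` attribute and that the numbering starts at 4 for "Edit Title", as in the public SE data dump schema; both points should be checked against the real file. The function name `count_history_types` is made up for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed numbering (IDs 1-3 would be the initial title/body/tags,
# which are not part of the list above).
HISTORY_TYPES = {
    4: "Edit Title", 5: "Edit Body", 6: "Edit Tags",
    7: "Rollback Title", 8: "Rollback Body", 9: "Rollback Tags",
    10: "Post Closed", 11: "Post Reopened",
    12: "Post Deleted", 13: "Post Undeleted",
    14: "Post Locked", 15: "Post Unlocked",
    16: "Community Owned", 17: "Post Migrated",
    18: "Question Merged", 19: "Question Protected",
    20: "Question Unprotected", 21: "Post Disassociated",
    22: "Question Unmerged",
}


def count_history_types(xml_source):
    """Count <row> elements per history type.

    xml_source may be a file name or a binary file object. The file is
    streamed with iterparse, so large dumps are not loaded at once.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "row":
            type_id = int(elem.get("PostHistoryTypeId", 0))
            label = HISTORY_TYPES.get(type_id, f"Other ({type_id})")
            counts[label] += 1
            elem.clear()  # free the memory held by this row
    return counts


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever the dump is stored.
    for label, n in count_history_types("history.xml").most_common():
        print(f"{n:6d}  {label}")
```

If most rows turn out to be title, body and tag edits, the rarer types (merges, locks, protection) could probably be left for a later step.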
As proposed by Dimension10, the easiest way to recover the accounts of former users would be to check against the existing MD5 hashes. I will write a plug-in where users can enter the email address and password they used on SE. Both will then be checked against the MD5 hashes in the SE.TP dump. If the check succeeds, the account will be restored automatically. Writing such a plug-in should be quite easy (except for the devil …).
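The email part of this check could look roughly like the sketch below, written in Python for illustration rather than as the actual Q2A plug-in. It assumes that the dump stores a hex MD5 digest of the trimmed, lower-cased email address per user; that normalisation is a guess and has to be verified against a known account. The password part is left out because the format of that hash is not described here. `email_md5` and `find_user_by_email` are hypothetical helper names.

```python
import hashlib
import hmac


def email_md5(email: str) -> str:
    """Hex MD5 of the email, trimmed and lower-cased (assumed normalisation)."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def find_user_by_email(email: str, stored_hashes: dict):
    """Return the user ID whose stored hash matches the email, else None.

    stored_hashes maps user ID -> hex MD5 string taken from the dump.
    """
    candidate = email_md5(email)
    for user_id, stored in stored_hashes.items():
        # compare_digest avoids leaking information through timing.
        if hmac.compare_digest(candidate, stored.lower()):
            return user_id
    return None
```

Note that a matching email hash alone only shows that somebody knows the address, so the plug-in would still need a second factor (the password check, or a confirmation mail) before restoring an account.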
Migration of posts from SE.Physics
I will look into direct access to SE.Physics using Stack.PHP and the Stack Exchange API. If the primary keys of the database (user ID and post ID) are available, it should also be possible to write a plug-in that lets users import single posts directly from SE.Physics. However, the issue of attribution is not yet solved. The community should also discuss who should have the right to migrate such questions.
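To make the idea of importing a single post more concrete, here is a minimal sketch in Python (the real plug-in would use PHP and possibly Stack.PHP). It only builds the request URL for one question and formats an attribution line from the returned item; no network call is made. The API version `2.3`, the site name `physics` and the built-in `withbody` filter are assumptions to verify against the API documentation, and the wording of the attribution line is purely a placeholder, since that question is still open.

```python
from urllib.parse import urlencode

# Assumed API root and version.
API_ROOT = "https://api.stackexchange.com/2.3"


def question_url(question_id: int, site: str = "physics") -> str:
    """Build the API URL that returns one question including its body."""
    query = urlencode({"site": site, "filter": "withbody"})
    return f"{API_ROOT}/questions/{question_id}?{query}"


def attribution_line(item: dict) -> str:
    """Format a simple attribution text from one question item.

    item is expected to carry a 'link' and an 'owner' object with a
    'display_name'; deleted users may have no name, hence the fallback.
    """
    owner = item.get("owner", {})
    name = owner.get("display_name", "unknown user")
    return f"Originally posted by {name} at {item['link']}"
```

For example, `question_url(1234)` yields a URL for question 1234 on SE.Physics, and the JSON item it returns could be passed to `attribution_line` when the post is inserted into Q2A.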