"The great Father Zed, Archiblogopoios"
Dui bu qi. No I don’t. :-(
I imagine it’s an encoding issue. The encoding that pages on the blog are being served with is currently UTF-8; I don’t know what it might have been before.
In Firefox, at least, you should be able to go to one of the damaged pages, go to View > Character Encoding from the menu, and try different encodings until you find the right one (though you will need to know what one or two of the characters are supposed to be to make sure you’ve got the right one).
At that point you should be able to copy the text out and paste it into an editing form in a separate tab/window that is still set to UTF-8 in order to convert the text.
Hm, unfortunately it looks like the text of entries may actually be corrupted in the database and not simply transmitted with the wrong encoding, though it wouldn’t hurt to get a second opinion.
If the pages in question still do not display correctly with either UTF-8 or GB2312, then the flat files themselves have been corrupted and you are lost.
Let me try to type some Chinese here….
O! It didn’t work! I typed “Father” in Chinese but “??” appeared instead.
Fr. Ho: Yes… I know. I will see what can be done at the level of the server.
Here’s the problem: the original UTF-8 Chinese text was misinterpreted as though it were in Windows encoding (CP1256) and then “converted” into UTF-8. This happened *twice*. To recover the Chinese text, apply the reverse transformation. I only know how to do this under Unix using uconv (part of the ICU package from IBM):
cat garbled-file | uconv -f utf8 -t cp1252 | uconv -f utf8 -t cp1252
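For anyone without uconv, the same double reversal can be sketched in Python. This is only an illustration, assuming the damage was exactly two rounds of UTF-8 bytes being misread as cp1252 (cp1252 rather than the CP1256 typed above, per the correction later in the thread). The function names garble and ungarble are made up for this example. One caveat: Python's cp1252 codec leaves five byte values (0x81, 0x8D, 0x8F, 0x90, 0x9D) unmapped, so text whose bytes include them will raise an error here instead of converting.

```python
def garble(text: str, rounds: int = 2) -> str:
    """Reproduce the damage: read the UTF-8 bytes as cp1252, once per round."""
    for _ in range(rounds):
        text = text.encode("utf-8").decode("cp1252")
    return text


def ungarble(text: str, rounds: int = 2) -> str:
    """Undo it: write the characters back as cp1252 bytes, reread as UTF-8."""
    for _ in range(rounds):
        text = text.encode("cp1252").decode("utf-8")
    return text


original = "神父"  # "Father" in Chinese
damaged = garble(original)
print(damaged)            # twelve Latin-looking characters
print(ungarble(damaged))  # the two original characters again
```

Each round of ungarble is the Python counterpart of one `uconv -f utf8 -t cp1252` stage in the pipeline above.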
David C: I am not sure I entirely understand that, but I can pass that along.
Father, it’s as if someone found a scrap of paper that said “sex dies”, mistakenly thought it was in English, and translated it into Latin “sexus moritur”.
This must be why your Korean text has also been mangled for some time.
Hooray for Windows!
Apologies, I wrote “CP1256” above when I should have written “CP1252.” Father, if the recipe above doesn’t work for your admin, I’d be happy to ungarble the files if they can be sent to me conveniently.
I can confirm that David’s recipe works (with the substitution of cp1252), even for sequences of Chinese text copy-and-pasted directly from the blog. However, it doesn’t always appear to work when there are runs of English text interspersed; it looks like the runs of Chinese text may need to be converted individually, and a handful of them appear to have been irreversibly mangled.
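Converting the runs individually could look something like the following Python sketch. It assumes that every garbled run consists only of non-ASCII characters and went through exactly two rounds of cp1252 misreading; the helper names ungarble_run and ungarble_mixed are hypothetical. A run that cannot be reversed this way is left untouched so a person can look at it later.

```python
import re

# One or more consecutive non-ASCII characters.
NON_ASCII_RUN = re.compile(r"[^\x00-\x7f]+")


def ungarble_run(run: str, rounds: int = 2) -> str:
    """Reverse the cp1252 misreading for one run; return it unchanged on failure."""
    fixed = run
    try:
        for _ in range(rounds):
            fixed = fixed.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return run  # not reversible this way; leave it for a human to inspect
    return fixed


def ungarble_mixed(text: str, rounds: int = 2) -> str:
    """Apply the reversal to each non-ASCII run, leaving ASCII text alone."""
    return NON_ASCII_RUN.sub(lambda m: ungarble_run(m.group(0), rounds), text)
```

Because a failed run is returned as-is, a stray accented letter or smart quote that was never double-encoded does not abort the whole conversion, though it also means a partly damaged run is skipped entirely.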
Here is the recovered text (using David’s method) from Father’s post of a couple years ago:
(We’ll see if this actually shows up correctly in the comments section — if it does, Father Ho’s problem may be that his browser is using an encoding other than UTF-8 for posting.)
Nope. Evidently some things are still(?) misconfigured on the server side with respect to character encodings. Oh well.
Just out of curiosity: how often exactly do you post in Chinese, Father?
Catholic terms in Chinese: Catholic – ???; Father – ??; Nun – ??; Sacrament of Confession – ????.
Chinese characters do not work. Now I will try Korean. Catholic words in Korean: Catholic – ???; Father – ??; Sister – ??; Sacrament of Confession – ????
Hm, let’s try accented European characters: áéíóúýñ àèìòù?ñ
It looks like, for posted comments, any character not covered in cp1252 is getting smashed into a ?. If “smart quotes” work, that pretty much confirms it. The really maddening thing is that all these characters show up fine in previews.
Yep. New comments are being shoehorned into cp1252 encoding, losing most non-European characters (and some European ones).
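The effect is easy to reproduce in Python. This is a guess at the behaviour, not a look at the actual server code: it assumes the comment system encodes to cp1252 and substitutes "?" for anything it cannot represent, and the function name store_as_cp1252 is invented for the illustration. The grave-accented y is assumed to be the character lost in the test above.

```python
def store_as_cp1252(comment: str) -> str:
    """Simulate a cp1252-only store: unrepresentable characters become '?'."""
    return comment.encode("cp1252", errors="replace").decode("cp1252")


print(store_as_cp1252("神父"))              # ??
print(store_as_cp1252("àèìòùỳñ"))          # àèìòù?ñ
print(store_as_cp1252("“smart quotes”"))   # unchanged; cp1252 has both quotes
```

Once a character has been replaced this way there is nothing left to recover, unlike the double-encoded older posts.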
MenTaLguY: but the recovered characters displayed correctly before going into the comments system, yes?
MenTaLguY: sorry, I missed one of your comments which answers my question. Can you point me to some posts that don’t convert correctly? Thanks.
I am Chinese and I hope I can help you.
But where are the passages? Perhaps you can send them to me by email, and I will try to get them back into Chinese characters. If they are not corrupted, they can be converted again.
If you can tell me the English meaning of the passages, I can translate them back into Chinese, type the text, and send it to you by email.
And I will be happy to help you with these kinds of issues later too.
Just write me an email.
I am sending you an email to tell you my email address.
Just as a test:
???? ???, ??? ??? ?? ???????. (Russian)
????? ????, ? ?? ???? ????????. (Greek)
O?e naš, koji si na nebesima. (Serbian, Latin characters)
Ojcze nasz, który? jest w niebe. (Polish)
Pater noster, qui es in caelis. (Latin)
All of them appear correctly in the preview. Now for the post …
David: yes, the recovered comments show up correctly via preview. The post I linked to from which I recovered text actually doesn’t convert completely (try the title), at least partly due to the presence of some 8-bit characters (i.e. smart quotes). Also try some of the comments on that same post. In some cases it looks like the transformation is not reversible.
(Because it appears that the transformation is lossy, I’d definitely recommend going over any recovered text with a Chinese speaker to make sure it hasn’t been mangled to say something weird…)
Roland: to save yourself some trouble, it appears that any character not on this chart won’t work.
In any case (this is a _guess_; I’m not privy to the technical details of the site), it looks like one of the things which will need to be done is to change the database encoding from Microsoft cp1252, which can’t represent most international characters, to UTF-8. This will involve converting all of the data there from cp1252 to UTF-8. At that point it’s just a matter of making sure that WordPress is storing things with the correct encoding for the database (which it should do automatically, I think, but it wouldn’t hurt to check). That should take care of addressing problems for new posts and comments.
As far as the older posts, it would probably be a good idea to save the unconverted database, as converting the database to support international characters might mangle the existing mangled data further.
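As a rough illustration of "back up first, then convert", here is a Python sketch that works on a plain-text dump file, not on a live database. Everything in it is an assumption: the dump file, the ".orig" suffix, and the function name backup_then_transcode are hypothetical, and a real WordPress database would more likely be converted with the database's own tools.

```python
import shutil
from pathlib import Path


def backup_then_transcode(dump: Path) -> Path:
    """Copy the dump aside, then rewrite it from cp1252 to UTF-8 in place."""
    backup = dump.with_name(dump.name + ".orig")
    shutil.copy2(dump, backup)  # keep the unconverted bytes
    # Raises UnicodeDecodeError on the few bytes cp1252 leaves undefined.
    text = dump.read_bytes().decode("cp1252")
    dump.write_bytes(text.encode("utf-8"))
    return backup
```

Text that was already double-encoded gets one more layer of damage from this step, which is why the untouched copy is made before anything is rewritten.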
I suspect the reason that things used to work is that the software was simply passing things through without paying attention to character encodings, so things went in and out unconverted and usually happened to work. Then at some point updates to the infrastructure meant that parts of the system started paying proper attention to character encodings, which unfortunately meant that (since different parts of the system disagreed about character encodings, and data had already been stored in mismatched encodings) things started getting mangled.
This is useful for me, folks. Thanks.
Father, I’ve sent you an email with a hopefully robust solution, a script that attempts to circumvent the problems MenTaLguY has pointed out. I absolutely agree that you should save a backup copy!