Archive

Posts tagged ‘geeky’

Unicode really is a fascinating thing...

September 19, 2007 No comments

It turns out Unicode contains a character like this: U+0489


҉

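If you cannot see it, your browser or operating system probably does not support Unicode well enough to display this character.

OK, now let's look at something even more fun.

First, read this article: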
http://www.tipotheday.com/2007/08/26/wtf-is-this-character/

Then open this link and look at both the page and the title bar:
http://www.google.com/search?hl=en&q=%E2%80%AB%E2%80%AC%E2%80%AD%E2%80%AE%E2%80%AA%E2%80%AB%E2%80%AC%E2%80%AD%E2%80%AB%E2%80%AC%E2%80%AD%E2%80%AE%E2%80%AA%E2%80%AB%E2%80%AC%E2%80%AD%E2%80%AE%D2%89language+&btnG=Search

The explanation is here:
http://en.wikipedia.org/wiki/Unicode_control_characters#Bidirectional_text_control
For friends who cannot reach Wikipedia, here is the relevant passage:

Bidirectional text control


Unicode supports standard bidirectional text without any special characters. In other words Unicode conforming software should display right-to-left characters such as Hebrew letters as right-to-left simply from the properties of those characters. Similarly, Unicode handles the mixture of left-to-right text alongside right-to-left text without any special characters. For example, one can quote Arabic (“بسملة”) right alongside English and the Arabic letters will flow from right-to-left and the Latin letters left-to-right. However, support for bidirectional text becomes more complicated when text flowing in opposite directions is embedded hierarchically, for example if one quotes an Arabic phrase that in turn quotes an English phrase. Other situations may complicate this, for example when an author wants the left-to-right characters overridden so that they flow from right-to-left. While these situations are fairly rare, Unicode provides seven characters (U+200E, U+200F, U+202A, U+202B, U+202C, U+202D, U+202E) to help control these embedded bidirectional text levels up to 61 levels deep.

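In fact, in both of these examples the character that looks like a ring of commas (U+0489) is only a decoy. What actually does the work is the run of bidirectional control characters in the range U+202B – U+202E. They are all invisible, so a visible decoy is needed to give you something you can select and copy.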
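To see what is hiding in front of the decoy, the query string of the Google link can be decoded and each character named. This is only a minimal sketch in Python: `ENCODED` is the `q` parameter copied from the link above (without the trailing `+`), and `describe` is a helper name made up for this illustration.

```python
import unicodedata
from urllib.parse import unquote

# The `q` parameter from the Google link above, minus the trailing "+".
ENCODED = (
    "%E2%80%AB%E2%80%AC%E2%80%AD%E2%80%AE%E2%80%AA%E2%80%AB%E2%80%AC"
    "%E2%80%AD%E2%80%AB%E2%80%AC%E2%80%AD%E2%80%AE%E2%80%AA%E2%80%AB"
    "%E2%80%AC%E2%80%AD%E2%80%AE%D2%89language"
)


def describe(text):
    """Return (code point, Unicode name) pairs for every character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# unquote() decodes the percent-escapes as UTF-8 by default.
decoded = unquote(ENCODED)

for code_point, name in describe(decoded):
    print(code_point, name)
```

The listing shows a run of embedding, override, and pop characters from U+202A to U+202E, then U+0489 (COMBINING CYRILLIC MILLIONS SIGN), then the plain letters of "language".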

Even more fun: view the source of either of the two pages above and you are guaranteed to go crazy, because the source code is reversed as well...

In reality, though, the data is still stored in its normal order. The problem lies entirely in how the text is rendered.
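A small Python sketch can make this concrete. The sample string is my own illustration, not taken from either page: it wraps a word in a right-to-left override and then inspects the stored characters and bytes, which never pass through a bidi-aware renderer.

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Hypothetical sample: a bidi-aware renderer would draw "language" mirrored.
text = RLO + "language" + PDF

# Indexing and slicing work on the stored (logical) order, so the word
# comes back exactly as it was typed.
stored = text[1:-1]

# The UTF-8 bytes are in logical order too: override, letters, pop.
raw = text.encode("utf-8")

print(stored)
print(raw)
```

Only the on-screen drawing is reversed. Anything that works on code points or bytes, such as searching, slicing, or saving to disk, sees the original order.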

So if every editor follows the Unicode standard, how could we ever see the text in its true order? It looks like a paradox.
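One possible way out, sketched here in Python under my own assumptions, is to stop handing the control characters to the renderer at all: either replace them with visible tags or drop them. `reveal_bidi` and `strip_bidi` are hypothetical helper names, and the set contains only the seven characters listed in the Wikipedia excerpt above.

```python
# The seven bidirectional control characters listed in the excerpt above.
BIDI_CONTROLS = {
    "\u200e", "\u200f",                                # LRM, RLM
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
}


def reveal_bidi(text):
    """Replace each bidi control character with a visible <U+XXXX> tag."""
    return "".join(
        f"<U+{ord(ch):04X}>" if ch in BIDI_CONTROLS else ch
        for ch in text
    )


def strip_bidi(text):
    """Drop bidi control characters, leaving all other characters in place."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


sample = "\u202elanguage\u202c"  # hypothetical sample input
print(reveal_bidi(sample))       # <U+202E>language<U+202C>
print(strip_bidi(sample))        # language
```

This only neutralizes the explicit controls. Letters that are inherently right-to-left, such as Hebrew or Arabic, would still be laid out right-to-left by a conforming renderer.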

One more sigh: I18N really is a complicated problem...


From: Linus Torvalds <torvalds <at> linux-foundation.org>

September 7, 2007 1 comment

http://thread.gmane.org/gmane.comp.version-control.git/57643/focus=57918


From: Linus Torvalds <torvalds <at> linux-foundation.org>
Subject: Re: [RFC] Convert builin-mailinfo.c to use The Better String Library.
Newsgroups: gmane.comp.version-control.git
Date: 2007-09-06 17:50:28 GMT
On Wed, 5 Sep 2007, Dmitry Kakurin wrote:
>
> When I first looked at Git source code two things struck me as odd:
> 1. Pure C as opposed to C++. No idea why. Please don’t talk about portability,
> it’s BS.

*YOU* are full of bullshit.

C++ is a horrible language. It’s made more horrible by the fact that a lot
of substandard programmers use it, to the point where it’s much much
easier to generate total and utter crap with it. Quite frankly, even if
the choice of C were to do *nothing* but keep the C++ programmers out,
that in itself would be a huge reason to use C.

In other words: the choice of C is the only sane choice. I know Miles
Bader jokingly said "to piss you off", but it’s actually true. I’ve come
to the conclusion that any programmer that would prefer the project to be
in C++ over C is likely a programmer that I really *would* prefer to piss
off, so that he doesn’t come and screw up any project I’m involved with.

C++ leads to really really bad design choices. You invariably start using
the "nice" library features of the language like STL and Boost and other
total and utter crap, that may "help" you program, but causes:

– infinite amounts of pain when they don’t work (and anybody who tells me
  that STL and especially Boost are stable and portable is just so full
  of BS that it’s not even funny)

– inefficient abstracted programming models where two years down the road
  you notice that some abstraction wasn’t very efficient, but now all
  your code depends on all the nice object models around it, and you
  cannot fix it without rewriting your app.

In other words, the only way to do good, efficient, and system-level and
portable C++ ends up to limit yourself to all the things that are
basically available in C. And limiting your project to C means that people
don’t screw that up, and also means that you get a lot of programmers that
do actually understand low-level issues and don’t screw things up with any
idiotic "object model" crap.

So I’m sorry, but for something like git, where efficiency was a primary
objective, the "advantages" of C++ is just a huge mistake. The fact that
we also piss off people who cannot see that is just a big additional
advantage.

If you want a VCS that is written in C++, go play with Monotone. Really.
They use a "real database". They use "nice object-oriented libraries".
They use "nice C++ abstractions". And quite frankly, as a result of all
these design decisions that sound so appealing to some CS people, the end
result is a horrible and unmaintainable mess.

But I’m sure you’d like it more than git.

Linus

Very interesting, also boring...

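People are different. If you dislike something and cannot change it, you at least have the right to vote with your feet.
Or you can vote with your hands: if C++ really is better, use C++ to build something more successful than Git.
Either way, that beats this kind of pointless bickering.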

Honestly, the flame threads on newsmth are even more entertaining than this one. They make a good after-dinner pastime.