cpploverr - C++ Blog

Creating a DLL (Dynamic Link Library)


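     Windows executables come in two forms: programs and dynamic link libraries (DLLs). A program normally runs from an .EXE file, but an application can also call functions stored in a DLL.
     Whenever we call a Windows API function, we are in fact calling a function stored in a DLL.
     Using a DLL makes sense in the following situations:
     1) Several programs use the same DLL. Its code is then shared in memory instead of being loaded once per program, which saves memory.
     2) When part of the functionality needs upgrading, only the DLL has to be replaced, not the whole program.
     3) DLLs are language-independent, so when people who prefer different languages work together on a large project, DLLs make it easier for the parts to interoperate. A DLL written in Delphi can, of course, also be used from Visual Basic, C++ and similar systems.
     The examples below illustrate how to build DLLs in Delphi and the conventions to follow.
     Section 1: Building and calling a DLL
     I. Building the DLL
     File---New---Other---DLL Wizard
     This creates the basic skeleton of a DLL: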
     library Project2;
     uses
       SysUtils,
       Classes;
    {$R *.res}
     begin
     end.
     Rename the project to mydll and add the functions you need:
     library mydll;
     uses
       SysUtils,Classes,Dialogs,windows;
     // Despite their names, these routines ADD 3 or 2 (see the list below).
     function Triple(N:Integer):integer;stdcall;
     begin
       result:=N+3;
     end;
     function Double(N:Integer):integer;stdcall;
     begin
       result:=N+2;
     end;
     // Same as Triple, but shows a message through the VCL first
     function Triple1(N:Integer):integer;stdcall;
     begin
       showmessage('Computing N+3');
       result:=N+3;
     end;
     // Same as Double, but shows a message through the Windows API first
     function Double1(N:Integer):integer;stdcall;
     begin
       messagebox(0,'Computing N+2','Computing N+2',mb_ok);
       result:=N+2;
     end;
    exports
      Triple name 'Tr',
      Double name 'Do',
      Triple1 name 'TrM',
      Double1 name 'DoM';
   {$R *.RES}
   begin
   end.
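     The functions are:
               Triple: adds three to the value passed in
               Double: adds two to the value passed in
               Triple1: adds three and shows a message
               Double1: adds two and shows a message
     This example shows several rules for DLL code:
     1) Exported functions should be declared stdcall, so that they use the standard Win32 parameter-passing convention instead of the optimized register convention. The compiler does not enforce this, but stdcall is what callers written in other languages expect.
     (Note: register is Delphi's default calling convention. It passes parameters from left to right and uses up to three CPU registers; if there are more than three parameters, the rest go on the stack. Passing parameters in registers is the fastest method.
     stdcall is the convention used by the Windows API: parameters are pushed on the stack from right to left and the called routine cleans up the stack. It is the convention to use when calling the Windows API, and therefore the natural choice for a DLL.)
     2) Every exported function must be listed in an exports clause, which makes the routine visible outside the DLL.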
    exports
      Triple name 'Tr',
      Double name 'Do',
      Triple1 name 'TrM',
      Double1 name 'DoM';

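     This lists the interface names under which callers see the functions. Aliases are not required, but they are recommended because they make the functions easier for the calling program to find. Note also that, according to the author, Delphi 6.0 no longer accepts the index directive that Delphi 5.0 allowed: using index to identify an export reportedly produces an error in Delphi 6.0.
     The example shows two ways of displaying a message in order to make one point:
     showmessage('') is a VCL routine. Using it compiles VCL code into the DLL in addition to the copy already in the host program, so the resulting binary is noticeably larger.
     messagebox(0,'','',mb_ok) is a Windows API function, so the resulting binary stays small.
     In other words, when writing a DLL, avoid compiling the VCL in a second time wherever possible. Both approaches are shown here only for illustration.
     Save the project.
     Compile: Project---Build Mydll
     That completes a simple DLL.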
     II. Calling the DLL
     First declare the imported functions below implementation:
const
   gdi32='mydll.dll';
function triple(n:integer):integer;stdcall;external gdi32 name 'Tr';
function Double(N:Integer):integer;stdcall;external gdi32 name 'Do';
function triple1(n:integer):integer;stdcall;external gdi32 name 'TrM';
function Double1(N:Integer):integer;stdcall;external gdi32 name 'DoM';
     After that they can be used like ordinary functions, for example:
procedure TForm1.Button1Click(Sender: TObject);
var N:integer;
begin
   N:=updown1.position;
   edit1.text:=inttostr(triple(N));
end;
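
The external declarations above bind mydll.dll when the program starts, so the program will not start at all if the DLL is missing. As an alternative, here is a minimal sketch of loading the same DLL at run time with the Windows API calls LoadLibrary and GetProcAddress. It assumes the mydll.dll built above is on the search path; the type name TTripleFunc and the program name are illustrative and not part of the original article.

```pascal
program LoadMyDll;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;

type
  // Must match the DLL side: function Triple(N: Integer): Integer; stdcall;
  TTripleFunc = function(N: Integer): Integer; stdcall;

var
  Lib: HMODULE;
  Triple: TTripleFunc;
begin
  Lib := LoadLibrary('mydll.dll');
  if Lib = 0 then
  begin
    Writeln('mydll.dll could not be loaded, error ', GetLastError);
    Halt(1);
  end;
  try
    // 'Tr' is the exported name given in the DLL's exports clause
    @Triple := GetProcAddress(Lib, 'Tr');
    if Assigned(Triple) then
      Writeln('Triple(4) = ', Triple(4))
    else
      Writeln('Export "Tr" not found');
  finally
    FreeLibrary(Lib);
  end;
end.
```

The trade-off is a little more code in exchange for being able to report a missing DLL or export instead of failing at startup.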
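     Section 2: Delphi forms in a DLL
     I. Placing a form in a DLL
     Besides ordinary functions and procedures, a DLL can also contain finished Delphi forms and make them available to other programs. The steps are:
    1) Design the form in the usual way, but declare the interface function in the interface section:
    function Createform(capt:string):string;stdcall;

    2) Add the interface function below implementation:
function Createform(capt:string):string;stdcall;
// Note: passing Delphi strings across a DLL boundary requires ShareMem as the
// first unit in the uses clause of both the DLL and the host (or use PChar).
var   Form1: TForm1;
begin
   form1:=Tform1.Create(application);
   form1.caption:=capt;
   form1.show;
   result:=form1.caption;   // return the caption that was set
end;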

   3) Create the DLL project, which must contain:
uses
   unit1 in 'unit1.pas';
exports
{exported interface identifiers}
Createform name 'Myform';
   4) Build the calling program in the usual way, but first declare the DLL function below implementation:
const
   gdi32='myFormdll.dll';
   function Createform(capt:string):string;stdcall;external gdi32 name 'Myform';

procedure TForm3.Button1Click(Sender: TObject);
var m:string;
begin
   m:='My form';
   Createform(m);
end;
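     II. Passing data when calling a form in a DLL
     Data can be passed to and from the form through ordinary functions. Here is an example.
     1) Create the form
     Build a color-picker form to be placed in the DLL. It is built in the usual way, but with the following declarations:
     function mycolor(col:longint):longint;stdcall;
     function Getcolor:longint;stdcall;
     Here mycolor creates the form and Getcolor returns the chosen color.
     In the implementation section, declare a unit-level variable:
     var color1:longint;
     The corresponding code follows: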
function mycolor(col:longint):longint;stdcall;
var   Form1: TForm1;
begin
   form1:=Tform1.Create(application);
   form1.show;
   form1.panel1.Color:=col;
   form1.edit1.Text:=inttostr(form1.panel1.Color);
   result:=color1;
end;

function Getcolor:longint;stdcall;
begin
result:=color1;
end;

procedure TForm1.ScrollBar1Change(Sender: TObject);
begin
   panel2.Color:=RGB(ScrollBar1.Position,ScrollBar2.Position,ScrollBar3.Position);
   edit2.Text:=inttostr(panel2.Color);
   color1:=panel2.Color;
end;
procedure TForm1.Button2Click(Sender: TObject);
begin
    Release;   // destroy Form1; Release is safe to call from the form's own event handler
end;
     2) Create the DLL
     Once the form runs correctly, create the DLL:
library FormDLL;
{units loaded from file}
uses
   unit1 in 'unit1.pas';
exports
{exported interface identifiers}
Mycolor name 'My',
Getcolor name 'Get';
begin
end.
     3) Create the calling program
     First declare the DLL functions to be called:
const
   gdi32='formDll.dll';
   function Mycolor(col:longint):longint;stdcall;external gdi32 name 'My';
   function Getcolor:longint;stdcall;external gdi32 name 'Get';
     Then write the corresponding code:
procedure TForm1.Button1Click(Sender: TObject);
begin
   Mycolor(color);
end;
procedure TForm1.Button2Click(Sender: TObject);
begin
    color:=getcolor;
end;

---
This article was written and published with MultiBlogWriter, an offline blog content management tool from “国华软件”.

posted @ 2010-01-18 08:23 cpploverr | Views (203) | Comments (0)

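Format strings in Delphi

function Format(const Format: string; const Args: array of const): string;
Format specifier syntax:
"%" [index ":"] ["-"] [width] ["." prec] type
(1) A format specifier must begin with %.
(2) [index ":"] The index is the zero-based position in Args of the item to display. For example, if Args is
                  ['a', 'c'], the index of 'a' is 0 and the index of 'c' is 1; since there are only
                  two items, no index greater than 1 may be used.
                  Format('%2:s %1:s %0:s', ['1st', '2nd', '3rd']);
                  Result: '3rd 2nd 1st'
(3) ["-"] When the text to display is shorter than [width], this flag pads with spaces on the right;
            without ["-"], the padding goes on the left.
            Format('(%4s)', ['aa']); Result: '(  aa)'
(4) [width] Width
              The minimum number of characters to display. If the text is longer than [width], it is displayed
              in full; otherwise it is padded with spaces.
(5) ["." prec] Precision
                 For floating-point values this is generally the number of digits after the decimal point;
                 the d, x and s types interpret it as described below.
(6) type Type (see below)
The possible values of type are:
(1) d Signed decimal
        The argument must be an integer. If the format string also contains ["." prec] and the value
        has fewer digits than the precision, it is padded with leading zeros; if it has more, it is shown in full.
        Format('(%.3d)', [99]); Result: '(099)'
(2) u Unsigned decimal
        The argument must be an integer. Otherwise the same as d.
(3) e Scientific notation
        The value is shown in scientific notation, in the form '-d.ddd...E+ddd'.
        The argument must be a floating-point value. A negative number is preceded by a minus sign; there is
        always exactly one digit before the decimal point; the total number of digits, including the one before
        the decimal point, is given by ["." prec] and defaults to 15. If the value has more digits than the
        precision allows, it is rounded at the first excess digit. The exponent character E is always followed
        by a plus or minus sign and at least three digits.
(4) f Fixed
        The argument must be a floating-point value; the result has the form '-ddd.ddd...'.
        A negative value is preceded by a minus sign. The number of digits after the decimal point
        is given by ["." prec] and defaults to 2.
(5) g General
        The argument must be a floating-point value.
        The value is converted to the shortest possible string (in f or e form). The number of significant
        digits is given by ["." prec] and defaults to 15 (integer and fractional digits together). Trailing
        zeros are removed and the decimal point appears only when needed. The f form is used if the number of
        digits to the left of the decimal point is less than or equal to the specified precision and the value
        is greater than or equal to 0.00001; otherwise the e form (scientific notation) is used.
(6) n The argument must be a floating-point value. Same as f, but with thousand separators, e.g. 1,123,444
(7) m Money
        The argument must be a floating-point value. The currency symbol is shown according to the settings
        in the Control Panel. The number of digits after the decimal point is given by ["." prec]; without
        ["." prec] the default is 2.
(8) p Pointer
        The argument must be a pointer value.
        The pointer is converted to a hexadecimal string of 8 characters (on 32-bit Windows).
(9) s String
        The argument must be a character, a string or a PChar value.
        If ["." prec] is given and the string is longer than the precision, only the leftmost
        prec characters are kept and the rest are dropped.
(10) x Hexadecimal
         The argument must be an integer.
         If ["." prec] is given, the value is padded with leading zeros to that many digits.
Note: [index ":"], [width] and ["." prec] can each be given as *, in which case the value is taken from the next item in Args:
        Format('%*.*f', [8, 2, 123.456])
        is equivalent to: Format('%8.2f', [123.456]).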
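
The rules above can be tried out with a small console program. This is a minimal sketch, not part of the original post; the program name is arbitrary, and the floating-point line depends on the decimal separator of the current locale, so no output is stated for it.

```pascal
program FormatDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;
begin
  // index: arguments are picked by zero-based position
  Writeln(Format('%2:s %1:s %0:s', ['1st', '2nd', '3rd']));  // 3rd 2nd 1st
  // width without and with the "-" flag
  Writeln(Format('(%4s)', ['aa']));     // (  aa)
  Writeln(Format('(%-4s)', ['aa']));    // (aa  )
  // precision: zero-padding for d and x, truncation for s
  Writeln(Format('(%.3d)', [99]));      // (099)
  Writeln(Format('(%.4x)', [255]));     // (00FF)
  Writeln(Format('(%.2s)', ['abcdef'])); // (ab)
  // width and precision taken from Args; same as '%8.2f'
  Writeln(Format('%*.*f', [8, 2, 123.456]));
end.
```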


posted @ 2010-01-15 09:50 cpploverr | Views (959) | Comments (0)

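Approaches to internationalization in Delphi, compared (reposted)

As globalization deepens, software increasingly resembles dandelion seeds: it drifts everywhere and takes root. The problem to solve is displaying different languages. Naturally we want one program that supports different languages without code changes, rather than maintaining many versions.

The first issue is garbled text. Delphi for Win32, up to and including version 11.x, still does not support Unicode, so ANSI encoding is normally used. Garbled text appears in these situations:

The language of the text differs from the language currently configured in the system. For example, the Simplified Chinese edition of QQ shows garbage on a Traditional Chinese operating system (or on a Simplified Chinese system whose "language for non-Unicode programs" regional setting is Traditional Chinese). Even after changing Font.Charset, some components such as StatusBar still display garbage. So show Vietnamese on a Vietnamese edition of Windows and Persian on a Persian edition; do not show Persian on Vietnamese Windows or Vietnamese on Persian Windows. That guarantees there is no garbled text, and fortunately such mismatched use is uncommon.
The language pack for the language you want to display is not installed.
To be completely free of garbled text you must use Unicode throughout, with a full set of Unicode-aware components such as Tnt. Its range of UI components is limited, however, and you can hardly avoid using other components as well.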

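        Back to the main topic. First, let us see which strings need multi-language support, and then weigh the various implementations.
            1. Components on the form, such as the Caption of a TButton;
            2. Messages shown from code, such as ShowMessage('Are you sure?') or Raise Exception.Create('Error!');
            3. Messages reported by exceptions raised at run time, such as the exception caused by a division by zero;
            4. The same kinds of strings inside third-party component packages.

        There are many ways to implement multi-language support. Here are a few:
            1. The resource generation tool that ships with Delphi
                This tool enumerates all strings in the project's dfm files and all strings declared as ResourceString in the pas files, and compiles them into a separate resource for each language. You select the language before building the project, and each language is compiled into its own exe.

The tool is very inconvenient and not a complete solution. Like Borland's Midas demo (TClientDataSet connects through ProviderName to a TDataSetProvider on the RemoteDataModule; when building a real ERP system, who would drop 100 TDataSetProviders to connect 100 TDataSets?), it is only a demonstration that the principle works.

First, because the dfm is itself part of the resources, every modification requires "Update Resource DLLs...". If you rename Button1 to Button2 and forget, you get a "resource Button1 not found" error at run time. The dictionary editor shows not only strings but also data such as Left/Top. And dictionaries cannot be reused: a word translated in one module must be translated again in the next.

Second, there is one exe/bpl per language. If your system is split into packages, each bpl also exists once per language, and you must take care not to combine bpls of different languages, or one screen will appear in Chinese and another in German (one of them possibly garbled).

Third, in a system assembled from bpls, if a third-party component offers no multi-language solution, you have to modify the component. We generally avoid that, because third-party components are updated at any time, and you would have to redo your changes after every update.

So hardly anyone uses this built-in Delphi approach (except for demos).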

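2. Resource DLLs

               Use a separate resource DLL and fetch translated strings with functions such as LoadResString. You have to insert these calls everywhere to replace strings one by one, which is exhausting, especially for strings on forms. The dictionary can be reused.
            3. The ini-file approach widely discussed online
                Here you write a replacement engine that reads the language strings from an ini file at run time and replaces the display text of the components. This is a big step forward from the first approach: there is no need to compile one exe per language, only to supply different ini files. If a form changes and the ini is not updated, nothing fatal happens; at worst some text is left untranslated. The engine also provides a string translation function, so messages shown from code can be handled too. The file format comes in three variants:
                (1) [number]=[string]
                      Every string is numbered from 1 (1, 2, 3, 4, ...). This is tedious and requires code changes, although switching languages at run time works.
                (2) [component.property]=[string]
                      This maps component instances one to one and uses RTTI to identify properties. Replacement is precise and run-time language switching works. The drawbacks: it is somewhat rigid, a string shared by several components is listed several times, and it does not extend well; complex components such as the Columns of a TListView are awkward to express.
                (3) [old string]=[new string]
                      Component instances are ignored; the ini is purely a translation table, or dictionary. Extensibility and run-time language switching can be handled in the engine. The drawback is that a word with several meanings cannot be handled.

Overall this approach is a big improvement, but to use ini files you still have to work around the 64 KB limit; a more professional solution is a custom file format.
For simplicity, a custom translation engine with the [old string]=[new string] file format is clearly the most convenient. With a dictionary management tool the dictionary files can be reused and also handed to a professional translation company. The remaining questions concern the engine: how to make it convenient, ideally with no code written by the user; how to make it extensible to any third-party component; how to make it flexible, so that one screen can show text in several languages and the same word can be translated with different meanings...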
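
As an illustration of the [old string]=[new string] dictionary, here is a minimal sketch of the lookup such an engine might perform. It assumes the dictionary is held in a TStringList of old=new lines; the function name Translate and the sample entries are hypothetical. Note that with this storage a source string containing '=' cannot be used as a key.

```pascal
program DictDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;

// Looks S up in a list of "old=new" lines.
// Text without a dictionary entry passes through unchanged.
function Translate(Dict: TStrings; const S: string): string;
var
  Idx: Integer;
begin
  Idx := Dict.IndexOfName(S);
  if Idx >= 0 then
    Result := Dict.ValueFromIndex[Idx]
  else
    Result := S;
end;

var
  Dict: TStringList;
begin
  Dict := TStringList.Create;
  try
    // A real engine would load these lines from a language file;
    // they are added inline here to keep the sketch self-contained.
    Dict.Add('Open=Ouvrir');
    Dict.Add('Save=Enregistrer');
    Writeln(Translate(Dict, 'Open'));   // Ouvrir
    Writeln(Translate(Dict, 'Close'));  // Close (no entry, so unchanged)
  finally
    Dict.Free;
  end;
end.
```

Falling back to the original text matches the behavior described above: a missing entry leaves one string untranslated instead of causing an error.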

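4. Derive a subclass from every component class and translate the text in the subclass's Loaded method. Because the leaf-level components have to be handled (TLabel and TPanel share the ancestor TControl, but handling only the ancestor is not enough), the workload is considerable, and for existing programs nothing can be done short of swapping components.

5. Register a translation function for each component class. The engine traverses the container, finds for each component the registered function of its nearest ancestor, and calls it to translate the component's text. This way leaf classes need not be handled individually; it is enough to register functions at suitable levels of the class hierarchy, and existing programs need no changes. At design time you only drop one translator component on the form. After Loaded it scans the components on the form, recursing with for I:=0 to ComponentCount-1 or for i:=0 to ControlCount-1; for each component found, it looks up the registered function of the nearest ancestor and calls it to replace the text. Because the registered functions are added from outside, old code is untouched, support can be extended to any third-party component, and the third-party source does not have to be modified.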
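
The nearest-ancestor lookup of approach 5 can be sketched as follows. This is only an illustration under stated assumptions: the names RegisterTranslator, FindTranslator, TranslateTree, TDemoLabel and TDemoFancyLabel are hypothetical, plain TComponent descendants stand in for VCL controls so that the sketch runs without a form, and the "translation" is a hard-coded replacement.

```pascal
program NearestTranslator;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;

type
  // A translator rewrites the visible text of one component
  TTranslateProc = procedure(C: TComponent);

  TRegEntry = record
    Cls: TClass;
    Proc: TTranslateProc;
  end;

  // Stand-ins for real controls (hypothetical classes)
  TDemoLabel = class(TComponent)
  public
    Caption: string;
  end;
  TDemoFancyLabel = class(TDemoLabel);

var
  Registry: array of TRegEntry;

procedure RegisterTranslator(Cls: TClass; Proc: TTranslateProc);
begin
  SetLength(Registry, Length(Registry) + 1);
  Registry[High(Registry)].Cls := Cls;
  Registry[High(Registry)].Proc := Proc;
end;

// Walks from the component's own class up through its ancestors and
// returns the first registered translator, or nil if there is none.
function FindTranslator(C: TComponent): TTranslateProc;
var
  Cls: TClass;
  I: Integer;
begin
  Result := nil;
  Cls := C.ClassType;
  while Cls <> nil do
  begin
    for I := 0 to High(Registry) do
      if Registry[I].Cls = Cls then
      begin
        Result := Registry[I].Proc;
        Exit;
      end;
    Cls := Cls.ClassParent;
  end;
end;

// Translates Root and, recursively, every component it owns
procedure TranslateTree(Root: TComponent);
var
  I: Integer;
  Proc: TTranslateProc;
begin
  Proc := FindTranslator(Root);
  if Assigned(Proc) then
    Proc(Root);
  for I := 0 to Root.ComponentCount - 1 do
    TranslateTree(Root.Components[I]);
end;

procedure TranslateLabel(C: TComponent);
begin
  if TDemoLabel(C).Caption = 'Open' then
    TDemoLabel(C).Caption := 'Ouvrir';
end;

var
  Root: TComponent;
  Lbl: TDemoFancyLabel;
begin
  // Registered once for TDemoLabel; descendants are covered automatically
  RegisterTranslator(TDemoLabel, TranslateLabel);
  Root := TComponent.Create(nil);
  try
    Lbl := TDemoFancyLabel.Create(Root);
    Lbl.Caption := 'Open';
    TranslateTree(Root);
    Writeln(Lbl.Caption);  // Ouvrir
  finally
    Root.Free;
  end;
end.
```

TDemoFancyLabel has no translator of its own, so the lookup falls back to the one registered for TDemoLabel, which is the "nearest ancestor" behavior the text describes.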

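I find the fifth approach elegant and clean. In terms of the GoF design patterns it is the Mediator pattern. Years ago we used a component called TXPMenu to get an XP-style interface, and it felt similarly clean: one component took care of everything, with no need to replace TLabel by TFlatLabel, TButton by TFlatButton, and so on. I remember that the magazine 《程序员》 even ran an article praising this component. But that component did not use the Mediator pattern and could not be extended well to support third-party components.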

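Finally, let us imagine how I would add complete multi-language support to Delphi if I were Borland. Delphi provides the section keyword ResourceString. String constants declared in such a section are compiled into the resource area of the exe and read at run time through the Windows API LoadStringA. External translation tools can therefore read these resource strings directly from the exe and write back the translated strings, and an embedded translation engine can hook this API function to translate text. But if the strings in the exe are not fully turned into resources, these techniques fail, and the gap comes precisely from Delphi's DFM files: Delphi stores each DFM as a single resource in the exe, so there is no way to decide per string whether it should be marked "do not localize" (a hint that appears next to many string constants in the Delphi source).

Suppose that, in addition to string, widestring, ansistring and so on, Delphi added a data type multistring and then changed the VCL component declarations (Borland, please solve Unicode at the same time), so that TLabel.Caption, for example, is declared as MultiString. MultiString values would get special handling, similar to how ResourceString goes through the LoadString API, being translated on every read, but carrying more information, such as passing out the instance. A global ApplicationMulti component, analogous to ApplicationEvents, would let outside code hook in. The dictionary itself would have to be supplied by the user (though a standard format could be defined so that Delphi developers can share and exchange dictionaries).

This looks feasible, but efficiency must be considered: (1) translating on every read is slow for things that are drawn frequently; (2) a ToolBar with many ToolButtons, for example, is normally updated in bulk with BeginUpdate/EndUpdate, so how would the VCL tell descendants to do the same? (Addendum: efficiency turns out not to be a problem. Frequent redraws caused by repeated string changes are already handled by the components themselves with BeginUpdate/EndUpdate and do not involve outside code.)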


posted @ 2010-01-14 07:47 cpploverr | Views (804) | Comments (1)

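ResourceString and internationalization (reposted)

If your software must be converted to different languages, the multi-language translation environment in Borland C++ Builder Enterprise Edition is extremely helpful. It lets you produce the language versions quickly, offers a dynamic way to build exactly the versions you need, and even lets you adjust fonts and other aspects of the GUI without touching the program code. Beyond building the multi-language interface, the techniques described later in this article let you switch languages dynamically. BCB handles all of this well, but there are small details you must watch, handle carefully or avoid; otherwise you can easily fail to get the result you want.

First, let us look at what to keep in mind when designing the program.

Make the first version English. This is very important, for two reasons. First, the best translations are usually made by native speakers. For a Japanese version, for instance, a native Japanese translator will do much better. Do you think more Japanese people understand English, or Chinese? The same holds for Korean, French and Spanish. Second, the BCB environment is most reliable with English plus the platform's own language. You can easily find a platform that displays and processes both English and Japanese, or English and French, because there is no language edition that cannot handle English. A platform that handles Chinese and Japanese together, or Chinese and French, is much harder to find. A further reason, not listed above, is that in most cases the English sentence is the longer one.

Do not force a default font on any component or form. Every platform uses a different default font and size. The default font on English systems is Tahoma, but not on Chinese or Japanese Windows, and a language can only be displayed with a font that supports it. Using the default font lets the application's font adjust automatically to the correct one. It may not look better (the default English fonts of most non-English editions of Windows are ugly, for whatever reason), but at least it is correct. This is easy to do: make sure every component that has a ParentFont property has it set to true. In particular, the form's ParentFont must be true. Otherwise you will run into big trouble when producing the language versions. Trust me, or I would not be writing this article to warn you.

Do not hard-code display strings in the program, as in ShowMessage("Hello World!"); or MainForm->Caption = "Main Form"; The reason is simple: the Borland C++ Builder tools cannot process strings embedded in code, so at translation time you would have to search all the source code yourself. Even a small throwaway program has a dozen such strings; imagine how bad it gets in a larger project. So we need a centralized method that also integrates with the C++ Builder translation tools. You may have read my earlier article on using the StringTable of a resource file. Forget it. When I wrote it I had not noticed that C++ Builder cannot use that method. More precisely, the method is not entirely unusable in C++ Builder, but if you want to switch languages dynamically, give up the resource-file StringTable. Even apart from that, I recommend the resourcestring approach described below, because it is more convenient and closer to the C++ Builder way of doing things. It has two drawbacks. One, it uses Pascal syntax; don't worry, it takes less than a minute to learn. Two, it works only in C++ Builder and Delphi, not in other compilers such as MS VC++.

Whatever the reason, never tie your program to one language. For example, you might write "92年2月2日" because it looks nicer, but such text immediately turns to garbage in other languages. The basic approach is to use internationally accepted formats. Ideally the program converts automatically to each country's customary notation, but that belongs to l10n, which I have not studied yet; more on that when I find time.

Now let us see how to use the Delphi-syntax resourcestring in C++ Builder. Borland ships an example for multi-language support in $(BCB)\Examples\Apps\RichEdit. This example matters: some of its code will be borrowed later when we introduce dynamic language switching. For now we borrow its ReConst.pas, which contains the following: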

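unit ReConst;
interface
uses Windows;
const
  ENGLISH = (SUBLANG_ENGLISH_US shl 10) or LANG_ENGLISH;
  GERMAN = (SUBLANG_GERMAN shl 10) or LANG_GERMAN;
  FRENCH = (SUBLANG_FRENCH shl 10) or LANG_FRENCH;
resourcestring
  SUntitled = 'Untitled';
  SPercent_s = '%s - %s';
  SSaveChanges = 'Save Changes?';
  SConfirmation = 'Confirmation';
  SNumberbetween = 'The number must be between 1 and 1638.';
  SLcid = '1033'; { US = 1033; DE = 1031; FR = 1036 }
implementation
end.

See the resourcestring section? That is all there is to it. Just replace the strings in the resourcestring section with your own definitions. Note that the strings are enclosed in single quotes, because this is Pascal syntax. For example, you could define your own string as:

resourcestring
  MyResString1 = 'My First resource string' ;

The SLcid line may look puzzling, but don't worry: everything after the semicolon on that line is a comment, because Pascal uses braces as comment delimiters. You can comment your own strings the same way. The const section defines constants of type Word, that is, 2 bytes with values from 0 to 65535. I recommend defining the constants your program uses here as well, in keeping with the idea of central management. Finally, remember to change the unit statement on the first line. First choose a name for your own file, following the example of ReConst.pas: if your project's abbreviation is My, call it MyConst.pas, and change "unit ReConst;" to "unit MyConst;". Then add the pas file to your project. That alone is not enough, because C++ needs a header file. Where does the header for MyConst.pas come from? Fortunately C++ Builder can generate it: open MyConst.pas and press Alt-F9 to compile it on its own. After it succeeds, the file MyConst.hpp appears. In principle every cpp file will need this header. Do not combine it with precompiled headers, though: it initializes constants, which precompiled headers cannot handle. Include the header individually in each source file that needs it. Next, how do you use these constants and resource strings in the program? Very simply: just use them. To use MyResString1, for example, write:

ShowMessage(MyConst_MyResString1);

That's right: prefix the resource string's name with "MyConst_". The prefix follows from your unit name; see the namespace name in the header generated in the previous step. With that understood, the next thing is to finish writing your program. Note that localization must be done last, when the program is already largely complete; if the forms change too much at once, the C++ Builder multi-language environment cannot cope. Now suppose your program is finished. How do you localize it? First open the Resource DLL Wizard to create the resource DLLs. It is found in the main menu under New->Others->New->Resource DLL Wizard. The wizard then guides you and asks you to choose the languages to create.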

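The first question the wizard asks is which project to localize. Tick the project and continue. Note that the BCB multi-language environment seems to have trouble with projects stored under directories with Chinese names, which it may fail to find, so keep the project in a path made of plain English characters. Next, choose the languages to convert to. The wizard lists all languages; you can select everything at once, or only some now and repeat this step later to add more. Don't worry about existing languages being overwritten; C++ Builder takes care of that. After selecting languages and pressing Next you see the directory for each language, for example CHT for Traditional Chinese, CHS for Simplified Chinese and JPN for Japanese. On the following page BCB asks whether there are other files to process; these would normally be dfm or rc files. The next step generates the languages, and C++ Builder lists the selected languages once more.

If everything is in order, press Finish. The final list of languages has a column called Update Mode. The first time you create a resource DLL for a language, the field says Create New. If the language already exists and you run the Resource DLL Wizard for it again, the wizard handles it and Update Mode shows Update. If you want to recreate it from scratch rather than update it, click on Update and a combo box appears for you to choose. Finally the wizard tells you that it must recompile your program in order to build the languages. At the same time it asks you to save the project as a project group, because each language is a project of its own and a project group allows them to be managed together. I suggest naming the project group after the project with the suffix "_i18n".

You may of course be used to a prefix such as ML, for Multiple Language, but I still prefer i18n. Once the files are saved and the project has been recompiled, the wizard opens the Translation Manager. Use its Workspace view to do the translation. The Translation Manager is not hard to use; you will pick it up by looking around. The important point is that, besides strings, it lets you resize components so that strings do not get clipped by their components in other languages. You can open it at any time from View->Translation Manager in the main menu. After each translation session you must recompile the languages you changed. Each language compiles into its own resource DLL, named after the project with the language name as the extension. For Project1.exe, the Traditional Chinese file is Project1.CHT and the Japanese one is Project1.JPN. These files must be in the same directory as the executable. The resourcestring entries mentioned earlier appear in each language's resource script, and you edit them through the Translation Manager as well.

However, C++ Builder has a bug here, at least up to 6.0 Update 2: your resourcestring entries are not loaded when the language is switched. The fix is simple. Open the Project Manager and add a file named XXX_DRC.rc to each language project, where XXX is your project name. There is one such file in each language's own directory, and it must be added to the corresponding language project; take care not to mix up directories and languages. To set which language the project uses on the current platform, use Project->Languages->Set Active. In fact your program decides which resource DLL to load from the following registry value: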

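HKEY_CURRENT_USER\Software\Borland\Locales\[Exec file name]

Here "Exec file name" is the name of your executable (including the path), and the value's data is the name of the language you want. For example, if the software should start in Chinese, the value should be CHT, and so on. Since the language is determined by this registry value, your installer must set it correctly. You may ask whether the system language cannot be detected automatically. It can, and that is what we describe next, though we go further than auto-detection: dynamic language switching, which lets the user change the display language at run time. First, locate the RichEdit directory of the C++ Builder example mentioned earlier and look for a file called reinit.pas in it (if it is missing, copy it from someone else's installation). Add this file to your project. Then, in the source file where the user should be able to switch languages, include the header generated from it:

#include "reinit.hpp"

The first time, this header does not exist on your system yet; generate it by compiling the unit. We then use the functions in this file to switch languages dynamically. To detect the system language automatically and switch to it, add the following code at a suitable place (usually the beginning of the main function):

if(LoadNewResourceModule(GetSystemDefaultLCID())) { ReinitializeForms(); }

This displays the correct language. These lines are in effect what C++ Builder does behind the scenes: if you ship resource DLLs and have not specified a language in the registry as described above, the program picks up the system language by itself. So all we need to do is replace GetSystemDefaultLCID() with a specific language. How do we build the LCID for a specific language? In the C++ Builder include directory, open winnt.h and find the macros beginning with LANG_ and SUBLANG_. Taking French as the example, the French LCID is built like this:

LCID lcid = (SUBLANG_FRENCH << 10) | LANG_FRENCH ;

Evidently LANG_ identifies the language and SUBLANG_ the regional variant. For Chinese, for instance, LANG_CHINESE has these sublanguages:

#define SUBLANG_CHINESE_TRADITIONAL 0x01 // Chinese (Taiwan)
#define SUBLANG_CHINESE_SIMPLIFIED 0x02 // Chinese (PR China)
#define SUBLANG_CHINESE_HONGKONG 0x03 // Chinese (Hong Kong S.A.R., P.R.C.)
#define SUBLANG_CHINESE_SINGAPORE 0x04 // Chinese (Singapore)
#define SUBLANG_CHINESE_MACAU 0x05 // Chinese (Macau S.A.R.)

Apply the same calculation to compose the LCID you need and pass it to LoadNewResourceModule() to load the right resource DLL, provided that the resource DLL file exists. By now you should have a grasp of multi-language support in C++ Builder. One final reminder: whenever you add, delete or otherwise change the appearance of anything on a form or in a resourcestring section, you must run Project->Languages->Update Resource DLLs to update your resource DLLs and recompile every language. Try not to put yourself in the position of repeating this over and over, or you will wear yourself out.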
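
The LCID arithmetic can be checked with a short sketch. It is written in Object Pascal to match ReConst.pas, and it redeclares the language constants with their winnt.h values so that it compiles without the Windows unit; the function name MakeLangId is illustrative.

```pascal
program LangIds;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  // Values as defined in winnt.h, repeated here to keep the sketch self-contained
  LANG_CHINESE = $04;
  LANG_GERMAN  = $07;
  LANG_ENGLISH = $09;
  LANG_FRENCH  = $0C;
  SUBLANG_ENGLISH_US          = $01;
  SUBLANG_GERMAN              = $01;
  SUBLANG_FRENCH              = $01;
  SUBLANG_CHINESE_TRADITIONAL = $01;
  SUBLANG_CHINESE_SIMPLIFIED  = $02;

// Sublanguage goes into the upper 6 bits, primary language into the lower 10
function MakeLangId(PrimaryLang, SubLang: Word): Word;
begin
  Result := (SubLang shl 10) or PrimaryLang;
end;

begin
  Writeln(MakeLangId(LANG_ENGLISH, SUBLANG_ENGLISH_US));          // 1033
  Writeln(MakeLangId(LANG_GERMAN, SUBLANG_GERMAN));               // 1031
  Writeln(MakeLangId(LANG_FRENCH, SUBLANG_FRENCH));               // 1036
  Writeln(MakeLangId(LANG_CHINESE, SUBLANG_CHINESE_TRADITIONAL)); // 1028
  Writeln(MakeLangId(LANG_CHINESE, SUBLANG_CHINESE_SIMPLIFIED));  // 2052
end.
```

The first three results agree with the comment in ReConst.pas ({ US = 1033; DE = 1031; FR = 1036 }).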

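Localization must wait until all screens are final, preferably until after a fair amount of testing. In other words, do not localize before the English release version is out; otherwise you will certainly waste several times the effort. Insist on this no matter what. If you cannot, ask your company for a dedicated localization team and double the project schedule.

One last reminder. When you send the English text out for translation, whatever the target language, ask the translators to hold to one rule: the translated string must not be longer than the English original, except for strings whose translation is fixed, such as Yes, No, OK and Cancel. If a translator says this cannot be done, find another translator. This matters because a literal translation can be far longer than the original sentence, which causes serious problems for your program. If you then have to adjust your screens for each language, double the project schedule again and see whether you can finish at all. So insist on this with the translators and ask them to adjust sentence length by translating the meaning rather than word for word. The length meant here is the length on screen (in pixels), not the number of characters in the sentence. If you are developing for a company, consider multi-language support from the very beginning. If you wait until the program has grown large before converting it, it will be very painful.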

-----------------------------------------------------------------------
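In the "old" days of Delphi programming (that is, before version 4) there were two basic ways to use strings in a program. You could embed them as literals in the source code, for example:

MessageDlg( 'Leave your stinkin'' mitts off that button, fool!',mtError, [mbOK], 0);

Or you could create a text file (with the .rc extension), for example: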

STRINGTABLE DISCARDABLE
{
     1 "Dialog Expert"
     2 "Dialog Expert from demonstration Expert DLL."
     3 "Application Expert"
     4 "Application Expert from demonstration Expert DLL"
     5 "&Create"
     6 "&Next"
     7 "An application name is required!"
     8 "The application name is not a valid identifier."
     9 "The path entered does not exist."
    10 "MAIN.PAS"
    11 "MAIN.DFM"
    12 "MAIN.TXT"
    ...
    // Variable names.
    20 "StatusLine"
    ...
}

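You then have to compile it into a resource file with the command-line tool Brcc32.exe, add it to a Delphi project or unit, and, wherever the program needs one of these strings, extract it from the resources with a function such as LoadStr. This is somewhat tedious, and to avoid getting bogged down in it you may end up endlessly adding literal strings to your source code instead of using resources.

Now the resourcestring keyword frees you from this chore. It brings two benefits: strings are easy to add and are all kept in one place; and memory is managed better, because every string in a resourcestring section is stored in the application as a resource.

Let us step into the new world of resourcestring. Add a unit named ResStrngs (or any other name) to your project and put all your strings (especially those the user will see: list contents, error messages and so on) in its interface section, like this: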


unit ResStrngs;

interface

resourcestring
     // Famous military figures
    SGeneralElectric =        'General Electric';
    SGeneralMills =           'General Mills';
    SGeneralUsage =           'General Usage';
    SGeneralHospital =        'General Hospital';
    SGeneralLedger =          'General Ledger';
    SGeneralProtectionFault = 'General Protection Fault';
    SGeneralSQLError =        'General SQL Error';
    SGeneralLeeSpeaking =     'General Lee Speaking';
    SCorporalPunishment =     'Corporal Punishment';
    SSgtFury =                'Sgt. Fury';
    SSgtCarter =              'Sgt. Carter';
    SSgtSchultz =             'Sgt. Schultz';
    SSargentShriver =         'Sargent Shriver';
    SCaptKangaroo =           'Capt. Kangaroo';
    SCaptUnderpants =         'Capt. Underpants';
    SColonelKlink =           'Colonel Klink';
    SPrivateBenjamin =        'Private Benjamin';
    SPrivateProperty =        'Private Property';
    SLeftenantDan =           'Leftenant Dan';
    SMutineerChristian =      'Mutineer Christian';
    SAtlantaHawks =           'Atlanta Hawks';
     // Friendly reminders
    SDontSleepInTheSubwayDarlin =
      'Don''t sleep in the subway darlin''';
    // You can keep adding strings here

implementation

end.


Add this unit to the uses clause in the implementation section of every unit that refers to these strings. You can then use the strings like this:


if ItIsPetulasVirtualHusband and HeIsLate then
    MessageDlg(SDontSleepInTheSubwayDarlin,
               mtInformation, [mbOK], 0);


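Another example: Borland/Inprise uses string resources in the same way; see consts.pas, dbconsts.pas and similar files in the ..\source\vcl directory.



Collecting strings in a resourcestring section has a further benefit. Programmers are usually not the best people to write the feedback or error messages that users will see; theirs tend to be too technical, for example "Module xyz: an unexpected error occurred while spawning a child thread", which tells the user nothing. "Please save your changes before starting" may be better.

Keeping the message strings in a separate file lets people who are good at writing user messages take care of them (with a programmer as consultant, of course, to confirm what each message means). If you do not want non-programmers editing your .pas file, you can give them the strings in a text file and copy their revised strings back into your resourcestring section when they are done.

Last but not least, collecting all user-visible strings in one place makes it easy to internationalize and localize your application. With Delphi's ITE (Integrated Translation Environment), internationalizing and localizing your strings is almost semi-automatic. With ITE you create a separate .dll for each language. If you ship several .dll files, your program can automatically load the matching one according to the locale of the computer it runs on.

The main ITE tools are the Resource DLL Wizard (File | New | Resource DLL Wizard) and the Translation Manager. The Translation Manager is used to enter the translations; see the "Integrated Translation Environment" section of the Delphi help for details.

Besides the ITE that comes with Delphi there are third-party tools. I prefer tools that come "in the box", provided, as with ITE, that they work well.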

---------------------------------------------------------------------------------
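Modifying a string resource at run time
Suppose Project1.exe contains a resource string with ID 100 whose value is "abc", and you want to change abc to 123. You can try the following:

procedure   TForm1.Button2Click(Sender:   TObject);
var
      h:   THandle;
      s2:   WideString;
begin
      // Caution: RT_STRING resources are stored in blocks of 16 length-prefixed
      // strings, with block ID = (string ID div 16) + 1. As written, this call
      // writes raw text under resource name 100, not string 100 of a string table.
      s2   :=   Edit2.Text;
      // The target file must not be the executable that is currently running
      h   :=   BeginUpdateResource('Project1.exe',   False);
      if   h   =   0   then   Exit;
      if   UpdateResource(h,   RT_STRING,   MakeIntResource(100),   LANG_NEUTRAL,
                  PWideChar(s2),   Length(s2)   *   SizeOf(WideChar))   then
            ShowMessage('a');
      EndUpdateResource(h,   False);
end;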
--------------------------------------------------------------------------

The better alternative to resourcestring
Instead of using resourcestrings, there is a better alternative:
ShowMessage (_('Action aborted'));

The _() is a pascal function inside gnugettext.pas, which translates the text. It returns a widestring, unlike resourcestring, which is limited to ansistring. You can use _() together with ansistrings as you like, because Delphi automatically converts widestrings to strings when needed. Another benefit of this is that you can write comments, that the translator can use to make better translations:

// Message shown to the user after the user clicks Abort button
ShowMessage (_('Action aborted'));

You can also write the comment in the same line:

ShowMessage (_('Action aborted')); // Message to user when clicking Abort button

But only the // style comment is supported - you cannot use { } or (* *) comments for this purpose.

Good comments normally lead to good translations. If the translator has a copy of the source code, poedit and kbabel can both show the location in the source code to the translator. This makes sense with _(), because the translator might get a good idea, what this is about, even if the translator isn't a programmer.

In other words, there are many reasons to use _() instead of resourcestrings. If you create a new application, don't even think about resourcestrings - just go directly for the _() solution.


posted @ 2010-01-14 07:45 cpploverr | Views (357) | Comments (0)

Software protection techniques and registration mechanisms: encryption basics (reposted)


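This article is an introduction to software protection ("software encryption"). It briefly covers the basics of software protection and some protection products, and is intended as a reference for software vendors in China and for individual shareware developers.

1. Overview of encryption

　　The security of a cryptosystem rests only on the secrecy of the key, not on the secrecy of the algorithm.

　　For encrypting pure data this is indeed true. If you use a reliable encryption algorithm, the people you do not want to see the data (the plaintext) cannot read it as long as they do not know the key.

　　Protecting software, however, differs from encrypting data: it can only ever be "hiding". Whether or not you want someone (a legitimate user or a cracker) to see the data (the software's plaintext), the software must ultimately run on a machine, and to the machine it has to be plaintext. Since the machine can "see" the plaintext, a cracker with the right techniques can see it too.

　　In theory, therefore, any software protection can be broken; only the difficulty varies. Some schemes keep the best crackers busy for months, others fall with no effort at all.

　　The task of anti-piracy (technical rather than administrative anti-piracy) is therefore to make cracking harder, so that breaking the software costs crackers more than they gain from it. Cracking then becomes pointless: who would pay more for pirated software than for the genuine product?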

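　　2. A short introduction to cryptography

　　2.1 Concepts

　　(1) Sender and receiver

　　Suppose a sender wants to send a message to a receiver and wants to send it securely: she wants to be sure that an eavesdropper cannot read it.

　　(2) Messages and encryption

　　A message is called plaintext. The process of disguising a message to hide its content is called encryption, the encrypted message is called ciphertext, and the process of turning ciphertext back into plaintext is called decryption.

　　Plaintext is denoted M (message) or P (plaintext). It can be a bit stream, a text file, a bitmap, digitized speech or a digitized video image. As far as a computer is concerned, P is simply binary data. Plaintext may be transmitted or stored; in either case M is the message to be encrypted.

　　Ciphertext is denoted C. It is also binary data, sometimes the same size as M and sometimes slightly larger (by combining compression with encryption, C may be smaller than P, but encryption alone usually does not achieve this). The encryption function E operates on M to produce the ciphertext C, written mathematically as:

　　E(M) = C

　　Conversely, the decryption function D operates on C to produce M:

　　D(C) = M

　　Since encrypting and then decrypting a message must recover the original plaintext, the following identity must hold:

　　D(E(M)) = M

　　(3) Authentication, integrity and non-repudiation

　　Besides confidentiality, cryptography usually serves other purposes:

　　(a) Authentication

　　The receiver of a message should be able to confirm its origin; an intruder should not be able to masquerade as someone else.

　　(b) Integrity

　　The receiver of a message should be able to verify that it was not modified in transit; an intruder should not be able to substitute a false message for a legitimate one.

　　(c) Non-repudiation

　　A sender should not be able to falsely deny later that he sent the message.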

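　　(4) Algorithms and keys

　　A cryptographic algorithm, also called a cipher, is the mathematical function used for encryption and decryption. (Usually there are two related functions, one for encryption and one for decryption.)

　　If the security of an algorithm depends on keeping the algorithm itself secret, it is called a restricted algorithm. Restricted algorithms are of historical interest but fall far short of today's standards. A large or changing group of users cannot use them, because whenever a user leaves the group, everyone else must switch to a different algorithm. If someone accidentally reveals the secret, everyone must change their algorithm.

　　Worse, restricted algorithms allow no quality control or standardization. Every group of users must have its own unique algorithm and cannot use off-the-shelf hardware or software products, since an eavesdropper could buy the same products and learn the algorithm. The users have to write and implement their own algorithm, and if the group has no good cryptographer, they cannot know whether their algorithm is secure.

　　Despite these major drawbacks, restricted algorithms remain popular for low-security applications; users either do not realize or do not care about the inherent problems of their systems.

　　Modern cryptography solves this problem with a key, denoted K. K may be any one of a large number of values. The range of possible values of K is called the keyspace. Both encryption and decryption use this key (that is, they depend on the key, which is shown as the subscript K), so the encryption and decryption functions become:

　　E_K(M) = C

　　D_K(C) = M

　　D_K(E_K(M)) = M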
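
The identity D_K(E_K(M)) = M can be seen in a few lines of code. The sketch below uses a repeating-key XOR purely as a toy stand-in for E_K and D_K; it is not secure and is not an algorithm the article recommends. The name XorCrypt and the sample message and keys are made up for the illustration.

```pascal
program XorRoundTrip;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Toy cipher: XOR every byte of Data with the repeating Key.
// XOR is its own inverse, so the same routine serves as E_K and D_K.
// Key must not be empty.
function XorCrypt(const Data, Key: AnsiString): AnsiString;
var
  I: Integer;
begin
  SetLength(Result, Length(Data));
  for I := 1 to Length(Data) do
    Result[I] := AnsiChar(Ord(Data[I]) xor Ord(Key[(I - 1) mod Length(Key) + 1]));
end;

var
  M, K, C: AnsiString;
begin
  M := 'attack at dawn';
  K := 'secret';
  C := XorCrypt(M, K);                 // C = E_K(M)
  Writeln(XorCrypt(C, K) = M);         // D_K(E_K(M)) = M, prints TRUE
  Writeln(XorCrypt(C, 'public') = M);  // a different key does not recover M, prints FALSE
end.
```

The routine itself can be published without harm to the round-trip property; only K has to stay secret, which is the point the text makes about key-based algorithms.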

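　　Some algorithms use different keys for encryption and decryption, that is, the encryption key K1 differs from the corresponding decryption key K2. In that case:

　　E_K1(M) = C

　　D_K2(C) = M

　　D_K2(E_K1(M)) = M

　　The security of all these algorithms rests on the key, not on the details of the algorithm. This means the algorithm can be published and analyzed, and products using it can be mass-produced. It does not matter if an eavesdropper knows your algorithm; without the specific key you use, he cannot read your messages.

　　A cryptosystem consists of an algorithm together with all possible plaintexts, ciphertexts and keys.

　　Key-based algorithms fall into two classes, symmetric algorithms and public-key algorithms, described in turn below.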

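　　2.2 Symmetric algorithms

　　In a symmetric algorithm, sometimes called a conventional algorithm, the encryption key can be calculated from the decryption key and vice versa. In most symmetric algorithms the two keys are the same. These algorithms are also called secret-key or single-key algorithms, and they require the sender and receiver to agree on a key before they can communicate securely. The security of a symmetric algorithm rests on the key: divulging the key means that anyone can encrypt and decrypt messages. As long as the communication needs to remain secret, the key must remain secret.

　　Encryption and decryption with a symmetric algorithm are written as:

　　E_K(M) = C

　　D_K(C) = M

　　Symmetric algorithms fall into two categories. Algorithms that operate on the plaintext one bit (or sometimes one byte) at a time are called stream algorithms or stream ciphers. The others operate on groups of bits of the plaintext; the groups are called blocks and the algorithms block algorithms or block ciphers. The text gives 64 bits as the typical block size of modern computer ciphers (as in DES; newer ciphers such as AES use 128-bit blocks), large enough to prevent analysis yet small enough to be workable. (Before computers, algorithms generally operated on the plaintext one character at a time, which can be seen as a stream cipher operating on a stream of characters.)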

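　　2.3 Public-key algorithms

　　Public-key algorithms (also called asymmetric algorithms) are designed so that the key used for encryption differs from the key used for decryption, and the decryption key cannot be calculated from the encryption key (at least not within any reasonable amount of time). They are called public-key algorithms because the encryption key can be made public: a stranger can use it to encrypt a message, but only someone with the corresponding decryption key can decrypt it. In these systems the encryption key is called the public key and the decryption key the private key. The private key is sometimes also called the secret key; to avoid confusion with symmetric algorithms, that name is not used here.

　　Encryption with the public key K is written as:

　　E_K(M) = C

　　Although the public key and the private key are different, decryption with the corresponding private key is written as:

　　D_K(C) = M

　　Sometimes a message is encrypted with the private key and decrypted with the public key; this is used for digital signatures (described in detail below). Despite the possible confusion, these operations are likewise written as:

　　E_K(M) = C

　　D_K(C) = M

　　Current public-key algorithms are much slower than symmetric algorithms, which limits their use for encrypting large amounts of data.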

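　　2.4 One-way hash functions

　　A one-way hash function H(M) operates on a message M of arbitrary length and returns a fixed-length hash value h, where h has length m.

　　Many functions take an input of arbitrary length and return an output of fixed length, but one-way hash functions have additional properties that make them one-way:

　　(1) Given M, it is easy to compute h;

　　(2) Given h, it is hard to compute an M such that H(M) = h;

　　(3) Given M, it is hard to find another message M' such that H(M) = H(M').

　　In many applications one-wayness alone is not enough, and a further condition called collision resistance is required:

　　It is hard to find two random messages M and M' such that H(M) = H(M').

　　Because of these properties, and because public-key algorithms are often slow, some cryptographic protocols use the hash as a digest of the message M in place of M itself: the sender signs H(M) instead of M.

　　For example, the SHA hash algorithm is used in the DSA digital signature protocol.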

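　　2.5 Digital signatures

　　Digital signatures are inseparable from public-key cryptosystems and hashing.

　　Several public-key algorithms can be used for digital signatures. In some, such as RSA, either the public key or the private key can be used for encryption; encrypt a document with your private key and you have a secure digital signature. In others, such as DSA, the two are separate: the digital signature algorithm cannot be used for encryption. The idea was first proposed by Diffie and Hellman.

　　The basic protocol is simple:

　　(1) A encrypts the document with her private key, thereby signing it.

　　(2) A sends the signed document to B.

　　(3) B decrypts the document with A's public key, thereby verifying the signature.

　　In this protocol the only thing that needs to be established is that A's public key really is hers. If B cannot complete step (3), he knows the signature is invalid.

　　The protocol also has the following properties:

　　(1) The signature is authentic. When B verifies the message with A's public key, he knows that A signed it.

　　(2) The signature is unforgeable. Only A knows her private key.

　　(3) The signature is not reusable. The signature is a function of the document and cannot be transferred to another document.

　　(4) The signed document is unalterable. If the document is changed in any way, it can no longer be verified with A's public key.

　　(5) The signature cannot be repudiated. B can verify A's signature without A's help.

　　In practice, because public-key algorithms are too slow, the signer usually signs a hash of the message rather than the message itself. This does not make the signature any less trustworthy.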

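　　3. Software protection techniques in common use

　　3.1 Serial-number protection

　　Mathematical algorithms have always been at the heart of cryptography, yet in ordinary software protection they seemed to receive little attention, because most of the time software protection was a matter of programming tricks. In recent years, however, with the spread of serial-number protection, mathematical algorithms seem to carry more and more weight in software protection.

　　Consider how the serial-number scheme that is so common on the Internet works. After a user downloads a piece of shareware, its use is normally limited in time; once the trial period is over, the user must register with the vendor to keep using it. Registration usually means that the user gives the vendor some personal information (typically a name) together with a credit card number, and the vendor computes a serial number from the user's information. The user then enters the registration information and the serial number in the software as instructed. Once the software has verified that the registration information is valid, it removes its restrictions. This kind of protection is fairly simple to implement, needs no extra cost and is convenient for buyers; the article estimates that 80% of the software on the Internet is protected this way.

　　Verifying a serial number means verifying that the relation between the user name and the serial number is correct. There are two basic ways. The first generates the serial number from the name the user entered and compares it with the serial number the user entered:

　　serial number = F(user name)

　　But this reproduces the vendor's serial-number generation inside the user's copy of the software, which is very insecure: however complex the computation, a cracker only needs to extract it from the program to build a universal key generator.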
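
To make the weakness concrete, here is a minimal sketch of the serial number = F(user name) check. F is a deliberately trivial rolling checksum chosen only for the example, and the names SerialFor and IsRegistered are hypothetical; no real product's algorithm is implied.

```pascal
program SerialCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;

// F(user name): a toy rolling checksum, formatted as 7 decimal digits.
// The intermediate value stays below 1000003, so the arithmetic cannot overflow.
function SerialFor(const UserName: string): string;
var
  I, H: Integer;
begin
  H := 7;
  for I := 1 to Length(UserName) do
    H := (H * 31 + Ord(UserName[I])) mod 1000003;
  Result := Format('%.7d', [H]);
end;

// The check shipped inside the program: recompute F and compare
function IsRegistered(const UserName, Serial: string): Boolean;
begin
  Result := SerialFor(UserName) = Serial;
end;

begin
  Writeln(IsRegistered('alice', '0000000'));           // FALSE
  // Anyone who extracts SerialFor from the binary has a key generator:
  Writeln(IsRegistered('alice', SerialFor('alice')));  // TRUE
end.
```

Because IsRegistered has to call SerialFor, the generator necessarily ships with the product, which is exactly the flaw the paragraph above describes.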

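　　The second way verifies the user name from the serial number:

　　user name = F^-1(serial number) (as in ACDSee)

　　This is in effect the inverse of the vendor's serial-number computation. If the forward and inverse algorithms are not symmetric, the cracker does face some difficulty, but such algorithms are quite hard to design.

　　So people came up with the following scheme:

　　F1(user name) = F2(serial number)

　　F1 and F2 are two entirely different algorithms, but the signature computed from the user name by F1 equals the signature computed from the serial number by F2. This scheme is fairly simple to design and much more secure than the two above. If F1 and F2 can be made irreversible, the security is quite good; but once a cracker finds the inverse of either one, the scheme is no longer secure. It seems that however hard one tries, single-argument schemes cannot be improved much further. What about two arguments?

　　fixed value = F(user name, serial number)

　　This scheme looks quite good. The relation between user name and serial number is no longer so obvious, but the one-to-one correspondence between user name and serial number is lost as well, so the developer must maintain the uniqueness of name and serial number pairs himself. That is hardly difficult; a database will do. The user name and serial number can also be split into several parts to build a scheme with more arguments:

　　fixed value = F(user name 1, user name 2, ... serial number 1, serial number 2 ...)

　　Most existing serial-number algorithms are designed by the software developers themselves, and most are quite simple. Even where the author has put in a great deal of effort, the result often falls short of what was hoped for.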

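　　3.2 Time limits

　　The trial versions of some programs limit the length of each run: after running for, say, 10 or 20 minutes the program stops working and must be restarted before it works again. Such programs naturally contain a timer that measures how long the program has been running.

　　This method is not used much.

　　3.3 Key file protection

　　A key file (registration file) registers the software by means of a file. The key file is usually small; it can be a plain text file or a binary file containing non-printable characters. Its content is some encrypted or unencrypted data, which may include the user name, registration code and similar information. The file format is defined by the software author. The trial version has no key file; after paying the registration fee the user receives a key file from the author, which may contain the user's personal information. The user only needs to put the file in the specified directory to turn the software into the full version. The file is usually placed in the software's installation directory or in the system directory. Each time the software starts, it reads data from the file, processes it with some algorithm and decides from the result whether the key file is valid; if so, it runs in registered mode.

　　This method is not used much either.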

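　　3.4 CD-check

　　This is optical disc protection. On startup the program checks whether a specific file exists on the disc in the CD drive; if not, it assumes the user does not have a genuine disc and refuses to run. The presence of the disc is usually not checked again while the program is running. On Windows this is typically implemented as follows: call GetLogicalDriveStrings() or GetLogicalDrives() to get the list of all drives in the system, check each drive with GetDriveType(), and for each CD drive use functions such as CreateFileA() or FindFirstFileA() to check whether the specific file exists, possibly going on to check the file's attributes, size and content.

　　3.5 Dongles

　　A dongle ("software dog") is an intelligent protection device. It is a hardware circuit attached to a port such as the parallel or serial port, and it comes with interface software for various languages and with utility software. When the protected software runs, the program sends queries to the dongle plugged into the computer; the dongle quickly computes the answer and responds, and a correct response lets the software continue. Without the dongle the program will not run. Sophisticated hardware and software techniques are combined to prevent piracy. Software of real commercial value is generally protected with dongles.

　　The dongles commonly seen are "foreign dogs" and "local dogs" (made in China). The "foreign" ones are mainly Rainbow from the United States and HASP from Israel; the main "local" ones are 金天地 (now a joint venture with Rainbow called "彩虹天地"), 深思 and 尖石. Broadly speaking, the foreign dongles are weaker than the local ones in "soft" aspects such as software interfaces, packing and anti-tracing, but their hardware is very hard to crack; the local dongles do the soft aspects very well, but their hardware is weaker, and anyone with modest microcontroller skills can clone them.

　　3.6 Floppy disk protection

　　Some non-standard tracks are formatted on a floppy disk and data, such as the software's decryption key, is written to them. Such a disk is called a "key disk". When the software runs, the user inserts the disk and the software reads the data in these tracks to decide whether it is a valid key disk.

　　Floppy protection also includes other techniques, such as weak bits.

　　With the decline of the floppy disk in recent years, this method has largely left the stage.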

　　3.7 将软件与机器硬件信息结合

　　用户得到（买到或从网上下载）软件后，安装时软件从用户的机器上取得该机器的一些硬件信息（如硬盘序列号、BOIS序列号等等），然后把这些信息和用户的序列号、用户名等进行计算，从而在一定程度上将软件和硬件部分绑定。用户需要把这一序列号用Email、电话或邮寄等方法寄给软件提供商或开发商，软件开发商利用注册机（软件）产生该软件的注册号寄给用户即可。软件加密虽然加密强度比硬件方法较弱，但它具有非常廉价的成本、方便的使用方法等优点。非常适合做为采用光盘（CDROM）等方式发授软件的加密方案。

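Advantages of this protection scheme

· Each machine has a different registration code. A code obtained by the user registers the software on one machine only, unlike the registration method used by most software today, where anyone who knows the code can install and register on any machine.

· No hardware or floppy disk is required.

· The vendor can control which machine the software runs on, and for how long or how many times.

· Before registration the software can act as a demo that runs only for a limited time or with limited functionality; after registration it immediately becomes the full version.

· With special techniques, it is hard for a cracker to find the rule by which registration numbers are generated.

· The registration-number generator (key generator) can itself be protected by a password, a key disk, a limit on the total number of uses, and so on.

· Convenient to use and inexpensive.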

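This scheme also has the following characteristics

1. Software registered this way can be installed and used on one machine only; copied to another machine, it will not run.

2. To install and run the software on another machine, the user must send the serial number shown when the software runs on that machine to the publisher in exchange for a registration password, and should of course pay for another copy.

3. This method is particularly suited to software published on the Internet or on CD-ROM.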

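Notes:

1. Part of the "Overview of encryption techniques" (加密技术概述) section draws on the university textbook “密码学基础”.

2. Part of the "Some currently popular software protection techniques" (当前流行的一些软件保护技术) section draws on the article “加密与解密--软件保护技术及完全解决方案”.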

---
This article was written and published with MultiBlogWriter, blog content management software from “国华软件”.

posted @ 2010-01-12 10:54 cpploverr

Delphi in a Unicode World Part III: Unicodifying Your Code

By: Nick Hodges

Original article: http://dn.codegear.com/article/38693

Abstract: This article describes what you need to do to get your code ready for Delphi 2009.

As discussed in Part I of this series, Delphi 2009 uses a UTF-16 based string by default. As a result, certain code idioms within existing code may need to be changed. In general, the large majority of existing code will work just fine with Delphi 2009. As you’ll see, most of the code changes that need to be made are quite specific and somewhat esoteric. However, some specific code idioms will need to be reviewed and perhaps have changes made to ensure that the code works properly with UnicodeString.

For example, any code that manipulates or does pointer operations on strings should be examined for Unicode compatibility. More specifically, any code that:

Assumes that SizeOf(Char) is 1
Assumes that the Length of a string is equal to the number of bytes in the string
Writes or reads strings from some persistent storage or uses a string as a data buffer

should be reviewed to ensure that those assumptions are not persisted in code. Code that writes to or reads from persistent storage needs to ensure that the correct number of bytes are being read or written, as a single byte no longer represents a single character.

Generally, any needed code changes should be straightforward and can be done with a minimal amount of effort.

Areas That Should “Just Work”

This section discusses areas of code that should continue to work, and should not require any changes to work properly with the new UnicodeString. All of the VCL and RTL have been updated to work as expected in Delphi 2009, and with very, very few exceptions, such will be the case. For instance, TStringList is now completely Unicode-aware, and all existing TStringList code should work as before. However, TStringList has been enhanced to work specifically with Unicode, so if you want to take advantage of that new functionality, you can, but you need not if you don’t want to.

General Use of String Types

In general, code that uses the string type should work as before. There is no need to re-declare string variables as AnsiString types, except as discussed below. String declarations should only be changed to be AnsiString when dealing with storage buffers or other types of code that uses the string as a data buffer.

The Runtime Library

The runtime library additions are discussed extensively in Part II.

That article doesn’t mention a new unit added to the RTL: AnsiStrings.pas. This unit exists for backwards compatibility with code that chooses to use or requires the use of AnsiString within it.

Runtime library code runs as expected and in general requires no change. The areas that do need change are described below.

The VCL

The entire VCL is Unicode aware. All existing VCL components work right out of the box just as they always have. The vast majority of your code using the VCL should continue to work as normal. We’ve done a lot of work to ensure that the VCL is both Unicode ready and backwards compatible. Normal VCL code that doesn’t do any specific string manipulation will work as before.

String Indexing

String Indexing works exactly as before, and code that indexes into strings doesn’t need to be changed:

var
  S: string;
  C: Char;
begin
  S := 'This is a string';
  C := S[1];  // C will hold 'T', but of course C is a WideChar
end;

Length/Copy/Delete/SizeOf with Strings

Copy will still work as before without change. So will Delete and all the SysUtils-based string manipulation routines.

Calls to Length(SomeString) will, as always, return the number of elements in the passed string.

Calls to SizeOf on any string identifier will return 4, as all string declarations are references and the size of a pointer is 4 bytes on the 32-bit platform that Delphi 2009 targets.

Consider the following code:

var
  S: string;
begin
    S:= 'abcdefghijklmnopqrstuvwxyz';
    WriteLn('Length = ', Length(S));
    WriteLn('SizeOf = ', SizeOf(S));
    WriteLn('TotalBytes = ', Length(S) * SizeOf(S[1]));
    ReadLn;
end.

The output of the above is as follows:

Length = 26
SizeOf = 4
TotalBytes = 52

Pointer Arithmetic on PChar

Pointer arithmetic on PChar should continue to work as before. The compiler knows the size of PChar, so code like the following will continue to work as expected:

var
  p: PChar;
  MyString: string;
begin
  ...
  p := @MyString[1];
  Inc(p);
  ...
end;

This code will work exactly the same as with previous versions of Delphi, but of course the types are different: PChar is now a PWideChar and MyString is now a UnicodeString.

ShortString

ShortString remains unchanged in both functionality and declaration, and will work just as before.

ShortString declarations allocate a buffer for a specific number of AnsiChars. Consider the following code:

var
  S: string[26];
begin
    S:= 'abcdefghijklmnopqrstuvwxyz';
    WriteLn('Length = ', Length(S));
    WriteLn('SizeOf = ', SizeOf(S));
    WriteLn('TotalBytes = ', Length(S) * SizeOf(S[1]));
    ReadLn;
end.

It has the following output:

Length = 26
SizeOf = 27
TotalBytes = 26

Note that the total bytes of the alphabet is 26, showing that the variable is holding AnsiChars.

In addition, consider the following code:

type
  TMyRecord = record
    String1: string[20];
    String2: string[15];
  end;

This record will be laid out in memory exactly as before: it will be a record of two ShortStrings holding AnsiChars. If you have a file of such records (for example, a file of TMyRecord), then any code reading and writing those records will work as before with no changes.

However, remember that Char is now a WideChar, so if you have some code that grabs those records out of a file and then calls something like:

var
  MyRec: TMyRecord;
  SomeChar: Char;
begin
  // Grab MyRec from a file...
  SomeChar := MyRec.String1[3];
  ...
end;

then you need to remember that the assignment will convert the AnsiChar in String1[3] to a WideChar. If you want this code to work as before, change the declaration of SomeChar:

var
  MyRec: TMyRecord;
  SomeChar: AnsiChar; // Now declared as an AnsiChar for the shortstring index
begin
  // Grab MyRec from a file...
  SomeChar := MyRec.String1[3];
  ...
end;

Areas That Should be Reviewed

This next section describes the various semantic code constructs that should be reviewed in existing code for Unicode compatibility. Because Char now equals WideChar, assumptions about the size in bytes of a character array or string may be invalid. The following lists a number of specific code constructs that should be examined to ensure that they are compatible with the new UnicodeString type.

SaveToFile/LoadFromFile

SaveToFile and LoadFromFile calls could very well go under the “Just Works” section above, as these calls will read and write just as they did before. However, you may want to consider using the new overloaded versions of these calls if you are going to be dealing with Unicode data when using them.

For instance, TStrings now includes the following set of overloaded methods:

procedure SaveToFile(const FileName: string); overload; virtual;
procedure SaveToFile(const FileName: string; Encoding: TEncoding); overload; virtual;

The second method above is the new overload that includes an encoding parameter that determines how the data will be written out to the file. (You can read Part II for an explanation of the TEncoding type.) If you call the first method above, the string data will be saved as it always has been – as ANSI data. Therefore, your existing code will work exactly as it always has.

However, if you put some Unicode string data into the text to be written out, you will need to use the second overload, passing a specific TEncoding type. If you do not, the strings will be written out as ANSI data, and data loss will likely result.

Therefore, the best idea here would be to review your SaveToFile and LoadFromFile calls, and add a second parameter to them to indicate how you’d like your data saved. If you don’t think you’ll ever be adding or using Unicode strings, though, you can leave things as they are.
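A minimal sketch of the explicit-encoding overloads might look like the following, assuming Delphi 2009 or later. The file name 'sample.txt' and the sample text are placeholders for this example.

```pascal
program SaveWithEncodingSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // #$4E2D is a CJK character that a Western ANSI code page cannot hold.
    Lines.Add('Sample: ' + #$4E2D);
    // Explicit encoding, so the character survives the trip to disk.
    Lines.SaveToFile('sample.txt', TEncoding.UTF8);
    Lines.Clear;
    // Read back with the same encoding so the bytes are decoded consistently.
    Lines.LoadFromFile('sample.txt', TEncoding.UTF8);
    WriteLn('Lines read back: ', Lines.Count);
  finally
    Lines.Free;
  end;
end.
```

Calling the one-parameter SaveToFile on the same list would write ANSI data, and the CJK character would be lost on a Western system.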

Use of the Chr Function

Existing code that needs to create a Char from an integer value may make use of the Chr function. Certain uses of the Chr function may result in the following error:

[DCC Error] PasParser.pas(169): E2010 Incompatible types: 'AnsiChar' and 'Char'

If code using the Chr function is assigning the result to an AnsiChar, then this error can easily be removed by replacing the Chr function with a cast to AnsiChar.

So, this code

MyChar := chr(i);

can be changed to

MyChar := AnsiChar(i);

Sets of Characters

Probably the most common code idiom that will draw the attention of the compiler is the use of characters in sets. In the past, a character was one byte, so holding characters in a set was no problem. But now, Char is declared as a WideChar, and thus cannot be held in a set any longer. So, if you have some code that looks like this:

procedure TDemoForm.Button1Click(Sender: TObject);
var
  C: Char;
begin
  C := Edit1.Text[1];

  if C in ['a'..'z', 'A'..'Z'] then
  begin
    Label1.Caption := 'It is there';
  end;
end;

and you compile it, you’ll get a warning that looks something like this:

[DCC Warning] Unit1.pas(40): W1050 WideChar reduced to byte char in set expressions.  Consider using 'CharInSet' function in 'SysUtils' unit.

You can, if you like, leave the code that way: the compiler will “know” what you are trying to do and generate the correct code. However, if you want to get rid of the warning, you can use the new CharInSet function:

  if CharInSet(C, ['a'..'z', 'A'..'Z']) then
  begin
   Label1.Caption := 'It is there';
  end;

The CharInSet function will return a Boolean value, and compile without the compiler warning.

Using Strings as Data Buffers

A common idiom is to use a string as a data buffer. It’s common because it’s been easy: manipulating strings is generally pretty straightforward. However, existing code that does this will almost certainly need to be adjusted given the fact that string now is a UnicodeString.

There are a couple of ways to deal with code that uses a string as a data buffer. The first is to simply declare the variable being used as a data buffer as an AnsiString instead of string. If the code uses Char to manipulate bytes in the buffer, declare those variables as AnsiChar. If you choose this route, all your code will work as before, but you do need to be careful that you’ve explicitly declared all variables accessing the string buffer to be ANSI types.

The second and preferred way of dealing with this situation is to convert your buffer from a string type to an array of bytes, or TBytes. TBytes is designed specifically for this purpose, and works much as the string type did when you used it as a buffer.
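As a small sketch of the TBytes approach, assuming Delphi 2009 or later, the program below encodes text into a byte buffer, changes one byte, and decodes it again. The variable names are illustrative only.

```pascal
program BytesBufferSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils;
var
  Buffer: TBytes;
  Decoded: string;
begin
  // Encode text into a byte buffer with an explicit encoding.
  Buffer := TEncoding.UTF8.GetBytes('abc');
  WriteLn('Bytes in buffer: ', Length(Buffer)); // three ASCII characters, three bytes
  // The buffer is plain bytes, so no character semantics are involved here.
  Buffer[0] := Ord('x');
  // Decode back to a string only when text is needed again.
  Decoded := TEncoding.UTF8.GetString(Buffer);
  WriteLn(Decoded); // xbc
end.
```

Keeping the buffer as TBytes makes the element size explicit: Length(Buffer) is always a byte count, whatever Char happens to be.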

Calls to SizeOf on Buffers

Calls to SizeOf when used with character arrays should be reviewed for correctness. Consider the following code:

procedure TDemoForm.Button1Click(Sender: TObject);
var
  P: array[0..16] of Char;
begin
  StrPCopy(P, 'This is a string');
  Memo1.Lines.Add('Length of P is ' +  IntToStr(Length(P)));
  Memo1.Lines.Add('Size of P is ' +  IntToStr(SizeOf(P)));
end;

This code will display the following in Memo1:

Length of P is 17
Size of P is 34

In the above code, Length returns the number of elements in the array, 17 (the 16 characters of the string plus the null terminator), but SizeOf returns the total number of bytes used by the array, in this case 34, i.e. two bytes per character. In previous versions, this code would have returned 17 for both.

Use of FillChar

Calls to FillChar need to be reviewed when used in conjunction with strings or a character. Consider the following code:

 var
   Count: Integer;
   Buffer: array[0..255] of Char;
 begin
   // Existing code - incorrect when string = UnicodeString
   Count := Length(Buffer);
   FillChar(Buffer, Count, 0);
   
   // Correct for Unicode – either one will be correct
   Count := SizeOf(Buffer);                // <<-- Specify buffer size in bytes
   Count := Length(Buffer) * SizeOf(Char); // <<-- Specify buffer size in bytes
   FillChar(Buffer, Count, 0);
 end;

Length returns the size in characters but FillChar expects Count to be in bytes. In this case, SizeOf should be used instead of Length (or Length needs to be multiplied by the size of Char).

In addition, because the size of a Char is now 2 bytes, FillChar fills a string or character array byte by byte, not Char by Char as it effectively did before.

Example:

var
  Buf: array[0..32] of Char;
begin
  FillChar(Buf, Length(Buf), #9);
end;

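This doesn’t fill the array with code point $09 but with code point $0909, and because the count is given in characters rather than bytes, only about half of the array is written. In order to get the expected result the code needs to be changed to:

var
  Buf: array[0..32] of Char;
begin
  ...
  // Leave room for the null terminator that StrPCopy appends
  StrPCopy(Buf, StringOfChar(#9, Length(Buf) - 1));
  ...
end;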

Using Character Literals

The following code

  if Edit1.Text[1] = #128 then

will recognize the Euro symbol and thus evaluate to True in most ANSI code pages. However, it will evaluate to False in Delphi 2009 because while #128 is the Euro sign in most ANSI code pages, it is a control character in Unicode. In Unicode, the Euro symbol is #$20AC.

Developers should replace any characters #128-#255 with literals, when converting to Delphi 2009, since:

  if Edit1.Text[1] = '€' then

will work the same as #128 in ANSI, but also work (i.e., recognize the Euro) in Delphi 2009 (where '€' is #$20AC).

Calls to Move

Calls to Move need to be reviewed when strings or character arrays are used. Consider the following code:

 var
   Count: Integer;
   Buf1, Buf2: array[0..255] of Char;
 begin
   // Existing code - incorrect when string = UnicodeString
   Count := Length(Buf1);
   Move(Buf1, Buf2, Count);
   
   // Correct for Unicode
   Count := SizeOf(Buf1);                // <<-- Specify buffer size in bytes
   Count := Length(Buf1) * SizeOf(Char); // <<-- Specify buffer size in bytes
   Move(Buf1, Buf2, Count);
 end;

Length returns the size in characters but Move expects Count to be in bytes. In this case, SizeOf should be used instead of Length (or Length needs to be multiplied by the size of Char).

Read/ReadBuffer methods of TStream

Calls to TStream.Read/ReadBuffer need to be reviewed when strings or character arrays are used. Consider the following code:

 var
   S: string;
   L: Integer;
   Stream: TStream;
   Temp: AnsiString;
 begin
   // Existing code - incorrect when string = UnicodeString
   Stream.Read(L, SizeOf(Integer));
   SetLength(S, L);
   Stream.Read(Pointer(S)^, L);
   
   // Correct for Unicode string data
   Stream.Read(L, SizeOf(Integer));
   SetLength(S, L);
   Stream.Read(Pointer(S)^, L * SizeOf(Char));  // <<-- Specify buffer size in bytes
   
   // Correct for Ansi string data
   Stream.Read(L, SizeOf(Integer));
   SetLength(Temp, L);              // <<-- Use temporary AnsiString
   Stream.Read(Pointer(Temp)^, L * SizeOf(AnsiChar));  // <<-- Specify buffer size in bytes
   S := Temp;                       // <<-- Widen string to Unicode
 end;

Note: The solution depends on the format of the data being read. See the new TEncoding class described in Part II to assist in properly encoding the text in the stream.

Write/WriteBuffer

As with Read/ReadBuffer, calls to TStream.Write/WriteBuffer need to be reviewed when strings or character arrays are used. Consider the following code:

 var
   S: string;
   Stream: TStream;
   Temp: AnsiString;
 begin
   // Existing code - incorrect when string = UnicodeString
   Stream.Write(Pointer(S)^, Length(S));
   
   // Correct for Unicode data
   Stream.Write(Pointer(S)^, Length(S) * SizeOf(Char)); // <<-- Specify buffer size in bytes
   
   // Correct for Ansi data
   Temp := S;          // <<-- Use temporary AnsiString
   Stream.Write(Pointer(Temp)^, Length(Temp) * SizeOf(AnsiChar));// <<-- Specify buffer size in bytes
 end;

Note: The solution depends on the format of the data being written. See the new TEncoding class described in Part II to assist in properly encoding the text in the stream.

LeadBytes

Replace calls like this:

 if Str[I] in LeadBytes then

with the IsLeadChar function:

 if IsLeadChar(Str[I]) then

TMemoryStream

In cases where a TMemoryStream is being used to write out a text file, it will be useful to write out a Byte Order Mark (BOM) as the first entry in the file. Here is an example of writing the BOM to the file:

 var
   BOM: TBytes;
 begin
   ...
   BOM := TEncoding.UTF8.GetPreamble;
   Write(BOM[0], Length(BOM));

All writing code will need to be changed to UTF8 encode the Unicode string:

 var
   Temp: Utf8String;
 begin
   ...
   Temp := Utf8Encode(Str); // <-- Str is the string being written out to the file.
   Write(Pointer(Temp)^, Length(Temp));
 //Write(Pointer(Str)^, Length(Str)); <-- this is the original call to write the string to the file.
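Put together, the two fragments above could form a complete program along these lines. This is a sketch under assumptions: the text 'Some text' and the file name 'out.txt' are placeholders, and WriteBuffer is used instead of Write so that a short write raises an exception.

```pascal
program Utf8StreamSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;
var
  Stream: TMemoryStream;
  BOM: TBytes;
  Temp: UTF8String;
begin
  Stream := TMemoryStream.Create;
  try
    // Write the UTF-8 byte order mark first.
    BOM := TEncoding.UTF8.GetPreamble;
    if Length(BOM) > 0 then
      Stream.WriteBuffer(BOM[0], Length(BOM));
    // Encode the UnicodeString to UTF-8, then write its bytes.
    Temp := UTF8Encode('Some text');
    if Length(Temp) > 0 then
      Stream.WriteBuffer(Pointer(Temp)^, Length(Temp));
    Stream.SaveToFile('out.txt');
    WriteLn('Bytes written: ', Stream.Size);
  finally
    Stream.Free;
  end;
end.
```

For a UTF8String, Length is already a byte count, which is why no multiplication by an element size is needed here.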

TStringStream

TStringStream now descends from a new type, TBytesStream. TBytesStream adds a property named Bytes which allows for direct access to the bytes within a TStringStream. TStringStream works as it always has, with the exception that the string it holds is a Unicode-based string.

MultiByteToWideChar

Calls to MultiByteToWideChar can simply be removed and replaced with a simple assignment. An example when using MultiByteToWideChar:

 procedure TWideCharStrList.AddString(const S: string);
 var
   Size, D: Integer;
 begin
   Size := Length(S);  // number of characters to convert
   D := (Size + 1) * SizeOf(WideChar);
   FList[FUsed] := AllocMem(D);
   MultiByteToWideChar(0, 0, PChar(S), Size, FList[FUsed], D);
   Inc(FUsed);
 end;

And after the change to Unicode, this call was changed to support compiling under both ANSI and Unicode:

 procedure TWideCharStrList.AddString(const S: string);
 begin
   FList[FUsed] := StrNew(PWideChar(S));
   Inc(FUsed);
 end;

SysUtils.AppendStr

This method is deprecated, and as such, is hard-coded to use AnsiString and no UnicodeString overload is available.

Replace calls like this:

 AppendStr(String1, String2);

with code like this:

 String1 := String1 + String2;

Or, better yet, use the new TStringBuilder class to concatenate strings.
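A short sketch of TStringBuilder used for repeated concatenation, assuming Delphi 2009 or later, might look like this. The loop and the text being built are invented for the example.

```pascal
program StringBuilderSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils;
var
  SB: TStringBuilder;
  I: Integer;
begin
  SB := TStringBuilder.Create;
  try
    // Each Append returns the builder itself, so calls can be chained.
    for I := 1 to 3 do
      SB.Append('item').Append(I).Append(';');
    WriteLn(SB.ToString); // item1;item2;item3;
  finally
    SB.Free;
  end;
end.
```

For two or three strings the plain + operator is usually simpler; the builder tends to pay off when many pieces are appended in a loop.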

GetProcAddress

Calls to GetProcAddress should always use PAnsiChar (there is no W-suffixed function in the SDK). For example:

 procedure CallLibraryProc(const LibraryName, ProcName: string);
 var
   Handle: THandle;
   RegisterProc: function: HResult stdcall;
 begin
   Handle := LoadOleControlLibrary(LibraryName, True);
   @RegisterProc := GetProcAddress(Handle, PAnsiChar(AnsiString(ProcName)));
 end;

Note: Windows.pas will provide an overloaded method that will do this conversion.

Use of PChar() casts to enable pointer arithmetic on non-char based pointer types

In previous versions, not all typed pointers supported pointer arithmetic. Because of this, the practice of casting various non-char pointers to PChar is used to enable pointer arithmetic. For Delphi 2009, pointer arithmetic can be enabled using a compiler directive, and it is specifically enabled for the PByte type. Therefore, if you have code like the following that casts pointer data to PChar for the purpose of performing pointer arithmetic on it:

 function TCustomVirtualStringTree.InternalData(Node: PVirtualNode): Pointer;
 begin
   if (Node = FRoot) or (Node = nil) then
     Result := nil
   else
     Result := PChar(Node) + FInternalDataOffset;
 end;

You should change this to use PByte rather than PChar:

 function TCustomVirtualStringTree.InternalData(Node: PVirtualNode): Pointer;
 begin
   if (Node = FRoot) or (Node = nil) then
     Result := nil
   else
     Result := PByte(Node) + FInternalDataOffset;
 end;

In the above snippet, Node is not actually character data. It is being cast to a PChar merely for the purpose of using pointer arithmetic to access data that is a certain number of bytes after Node. This worked previously because SizeOf(Char) = SizeOf(Byte). This is no longer true, and to ensure the code remains correct, it needs to be changed to use PByte rather than PChar. Without the change, Result will end up pointing to the incorrect data.
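The compiler directive mentioned above is {$POINTERMATH ON}. The following sketch, with an array invented for the example, contrasts typed pointer arithmetic, which steps by the size of the pointed-to type, with byte-wise stepping through PByte.

```pascal
program PointerMathSketch;
{$APPTYPE CONSOLE}
{$POINTERMATH ON}
var
  Values: array[0..3] of Integer;
  P: PInteger;
  Raw: PByte;
begin
  Values[0] := 10;
  Values[1] := 20;
  Values[2] := 30;
  Values[3] := 40;

  // Typed arithmetic: P + 2 advances by 2 * SizeOf(Integer) bytes.
  P := @Values[0];
  WriteLn((P + 2)^); // 30

  // Byte arithmetic: the offset must be given in bytes explicitly.
  Raw := PByte(@Values[0]);
  Inc(Raw, 2 * SizeOf(Integer));
  WriteLn(PInteger(Raw)^); // 30
end.
```

The InternalData example in the text is the second case: FInternalDataOffset is a byte offset, so PByte is the natural pointer type for it.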

Variant open array parameters

If you have code that uses TVarRec to handle variant open array parameters, you may need to adjust it to handle UnicodeString. A new type vtUnicodeString is defined for use with UnicodeStrings. The UnicodeString data is held in vUnicodeString. See the following snippet from DesignIntf.pas, showing a case where new code needed to be added to handle the UnicodeString type.

 procedure RegisterPropertiesInCategory(const CategoryName: string;
   const Filters: array of const); overload;
 var
   I: Integer;
 begin
   if Assigned(RegisterPropertyInCategoryProc) then
     for I := Low(Filters) to High(Filters) do
       with Filters[I] do
         case vType of
           vtPointer:
             RegisterPropertyInCategoryProc(CategoryName, nil,
               PTypeInfo(vPointer), '');
           vtClass:
             RegisterPropertyInCategoryProc(CategoryName, vClass, nil, '');
           vtAnsiString:
             RegisterPropertyInCategoryProc(CategoryName, nil, nil,
               string(vAnsiString));
           vtUnicodeString:
             RegisterPropertyInCategoryProc(CategoryName, nil, nil,
               string(vUnicodeString));
         else
           raise Exception.CreateResFmt(@sInvalidFilter, [I, vType]);
         end;
 end;

CreateProcessW

The Unicode version of CreateProcess (CreateProcessW) behaves slightly differently than the ANSI version. To quote MSDN in reference to the lpCommandLine parameter:

"The Unicode version of this function, CreateProcessW, can modify the contents of this string. Therefore, this parameter cannot be a pointer to read-only memory (such as a const variable or a literal string). If this parameter is a constant string, the function may cause an access violation."

Because of this, some existing code that calls CreateProcess may start giving Access Violations when compiled in Delphi 2009.

Examples of problematic code:

Passing in a string constant

   CreateProcess(nil, 'foo.exe', nil, nil, False, 0, nil, nil, StartupInfo, ProcessInfo);

Passing in a constant expression

const
  cMyExe = 'foo.exe';
begin
  CreateProcess(nil, cMyExe, nil, nil, False, 0, nil, nil, StartupInfo, ProcessInfo);
end;

Passing in a string with a Reference Count of -1:

const
  cMyExe = 'foo.exe';
var
  sMyExe: string;
begin
  sMyExe := cMyExe;
  CreateProcess(nil, PChar(sMyExe), nil, nil, False, 0, nil, nil, StartupInfo, ProcessInfo);
end;
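One way to avoid the problem, offered here as a sketch rather than as the article's own fix, is to copy the command line into a string variable and call UniqueString on it, which forces a private, writable copy (reference count 1) before the pointer is passed to CreateProcess. The executable name 'foo.exe' is the placeholder used in the examples above.

```pascal
program CreateProcessSketch;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;
var
  CmdLine: string;
  StartupInfo: TStartupInfo;
  ProcessInfo: TProcessInformation;
begin
  CmdLine := 'foo.exe';
  // The assignment above only references the constant (reference count -1).
  // UniqueString makes a heap copy that CreateProcessW is allowed to modify.
  UniqueString(CmdLine);

  FillChar(StartupInfo, SizeOf(StartupInfo), 0);
  StartupInfo.cb := SizeOf(StartupInfo);

  if CreateProcess(nil, PChar(CmdLine), nil, nil, False, 0, nil, nil,
    StartupInfo, ProcessInfo) then
  begin
    // Release the handles returned for the new process.
    CloseHandle(ProcessInfo.hThread);
    CloseHandle(ProcessInfo.hProcess);
  end
  else
    WriteLn('CreateProcess failed: ', SysErrorMessage(GetLastError));
end.
```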

Code to search for

The following is a list of code patterns that you might want to search for to ensure that your code is properly Unicode-enabled.

Search for any uses of “of Char” or “of AnsiChar” to ensure that the buffers are used correctly for Unicode
Search for instances of “string[” to ensure that the characters referenced are placed into Chars (i.e. WideChar).
Check for the explicit use of AnsiString, AnsiChar, and PAnsiChar to see if it is still necessary and correct.
Search for explicit use of ShortString to see if it is still necessary and correct
Search for Length( to ensure that it isn’t assuming that Length is the same as SizeOf
Search for Copy(, Seek(, Pointer(, AllocMem(, and GetMem( to ensure that they are correctly operating on strings or array of Chars.

They represent code constructs that could potentially need to be changed to support the new UnicodeString type.

Conclusion

So that sums up the types of code idioms you need to review for correctness in the Unicode world. In general, most of your code should work. Most of the warnings your code will receive can be easily fixed up. Most of the code patterns you’ll need to review are generally uncommon, so it is likely that much if not all of your existing code will work just fine.

(Source: http://dn.codegear.com/article/38693)


posted @ 2010-01-11 22:49 cpploverr

Delphi in a Unicode World Part II: New RTL Features and Classes to Support Unicode

By: Nick Hodges

Original article: http://dn.codegear.com/article/38498

Abstract: This article will cover the new features of the Tiburon Runtime Library that will help handle Unicode strings.

Introduction

In Part I, we saw how Unicode support is a huge benefit for Delphi developers by enabling communication with all character sets in the Unicode universe. We saw the basics of the UnicodeString type and how it will be used in Delphi.

In Part II, we’ll look at some of the new features of the Delphi Runtime Library that support Unicode and general string handling.

TCharacter Class

The Tiburon RTL includes a new class called TCharacter, which is found in the Character unit. It is a sealed class that consists entirely of static class functions. Developers should not create instances of TCharacter, but rather merely call its static class methods directly. Those class functions do a number of things, including:

Convert characters to upper or lower case
Determine whether a given character is of a certain type, i.e. is the character a letter, a number, a punctuation mark, etc.

TCharacter uses the standards set forth by the Unicode consortium.

Developers can use the TCharacter class to do many things previously done with sets of chars. For instance, this code:

uses
  Character;

begin
  if MyChar in ['a'..'z', 'A'..'Z'] then
  begin
    ...
  end;
end;

can be easily replaced with

uses
  Character;

begin
  if TCharacter.IsLetter(MyChar) then
  begin
    ...
  end;
end;

The Character unit also contains a number of standalone functions that wrap up the functionality of each class function from TCharacter, so if you prefer a simple function call, the above can be written as:

uses
  Character;

begin
  if IsLetter(MyChar) then
  begin
    ...
  end;
end;

Thus the TCharacter class can be used to do almost any manipulation or checking of characters that you might care to do.

In addition, TCharacter contains class methods to determine if a given character is a high or low surrogate of a surrogate pair.
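The surrogate checks can be tried with a character outside the Basic Multilingual Plane. The sketch below assumes the Delphi 2009 Character unit; later Delphi versions mark TCharacter as deprecated in favor of a Char helper, so warnings may appear there. The code point U+1D11E is just a convenient test value.

```pascal
program SurrogateSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils, Character;
var
  S: string;
begin
  // U+1D11E (musical symbol G clef) lies outside the BMP,
  // so UTF-16 stores it as two elements: a surrogate pair.
  S := TCharacter.ConvertFromUtf32($1D11E);
  WriteLn('Elements: ', Length(S));
  WriteLn('First is high surrogate: ', TCharacter.IsHighSurrogate(S[1]));
  WriteLn('Second is low surrogate: ', TCharacter.IsLowSurrogate(S[2]));
  WriteLn('Pair at index 1: ', TCharacter.IsSurrogatePair(S, 1));
end.
```

This is also a reminder that Length counts string elements, not user-visible characters: one character here occupies two elements.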

TEncoding Class

The Tiburon RTL also includes a new class called TEncoding. Its purpose is to define a specific type of character encoding so that you can tell the VCL what type of encoding you want used in specific situations.

For instance, you may have a TStringList instance that contains text that you want to write out to a file. Previously, you would have written:

begin
  ...
  MyStringList.SaveToFile('SomeFilename.txt');
  ...
end;

and the file would have been written out using the default ANSI encoding. That code will still work fine: it will write out the file using ANSI string encoding as it always has, but now that Delphi supports Unicode string data, developers may want to write out string data using a specific encoding. Thus, SaveToFile (as well as LoadFromFile) now takes an optional second parameter that defines the encoding to be used:

begin
  ...
  MyStringList.SaveToFile('SomeFilename.txt', TEncoding.Unicode);
  ...
end;

Execute the above code and the file will be written out as a Unicode (UTF-16) encoded text file.

TEncoding will also convert a given set of bytes from one encoding to another, retrieve information about the bytes and/or characters in a given string or array of characters, convert any string into an array of byte (TBytes), and other functionality that you may need with regard to the specific encoding of a given string or array of chars.

The TEncoding class includes the following class properties that give you singleton access to a TEncoding instance of the given encoding:

    class property ASCII: TEncoding read GetASCII;
    class property BigEndianUnicode: TEncoding read GetBigEndianUnicode;
    class property Default: TEncoding read GetDefault;
    class property Unicode: TEncoding read GetUnicode;
    class property UTF7: TEncoding read GetUTF7;
    class property UTF8: TEncoding read GetUTF8;

The Default property refers to the ANSI active codepage. The Unicode property refers to UTF-16.

TEncoding also includes the

class function TEncoding.GetEncoding(CodePage: Integer): TEncoding;

that will return an instance of TEncoding that has the affinity for the code page passed in the parameter.

In addition, it includes the following function:

function GetPreamble: TBytes;

which will return the correct BOM for the given encoding.

TEncoding is also interface compatible with the .NET class called Encoding.
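The conversion facilities described above can be sketched as follows, assuming Delphi 2009 or later. The sample text 'abc' is arbitrary.

```pascal
program EncodingSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils;
var
  Utf16Bytes, Utf8Bytes: TBytes;
begin
  // Turn a string into bytes in a chosen encoding.
  Utf16Bytes := TEncoding.Unicode.GetBytes('abc');
  WriteLn('UTF-16 bytes: ', Length(Utf16Bytes)); // 6, two bytes per element

  // Convert a byte array from one encoding to another.
  Utf8Bytes := TEncoding.Convert(TEncoding.Unicode, TEncoding.UTF8, Utf16Bytes);
  WriteLn('UTF-8 bytes: ', Length(Utf8Bytes)); // 3

  // The preamble is the byte order mark for the encoding.
  WriteLn('UTF-8 BOM length: ', Length(TEncoding.UTF8.GetPreamble)); // 3

  // Decode the bytes back into a string.
  WriteLn(TEncoding.UTF8.GetString(Utf8Bytes)); // abc
end.
```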

TStringBuilder

The RTL now includes a class called TStringBuilder. Its purpose is revealed in its name: it is a class designed to “build up” strings. TStringBuilder contains a number of overloaded functions for adding, replacing, and inserting content into a given string. The string builder class makes it easy to create single strings out of a variety of different data types. All of the Append, Insert, and Replace functions return an instance of TStringBuilder, so they can easily be chained together to create a single string.

For example, you might choose to use a TStringBuilder in place of a complicated Format statement and write the following code:

procedure TForm86.Button2Click(Sender: TObject);
var
  MyStringBuilder: TStringBuilder;
  Price: double;
begin
  MyStringBuilder := TStringBuilder.Create('');
  try
    Price := 1.49;
    Label1.Caption := MyStringBuilder.Append('The apples are $').Append(Price).
      Append(' a pound.').ToString;
  finally
    MyStringBuilder.Free;
  end;
end;

TStringBuilder is also interface compatible with the .NET class called StringBuilder.

Declaring New String Types

Tiburon’s compiler enables you to declare your own string type with an affinity for a given code page. Many code pages are available. (MSDN has a nice rundown of available code pages.) For instance, if you require a string type with an affinity for ANSI-Cyrillic, you can declare:

type
  // The code page for ANSI-Cyrillic is 1251
  CyrillicString = type AnsiString(1251);

And the new String type will be a string with an affinity for the Cyrillic code page.

Additional RTL Support for Unicode

The RTL adds a number of routines that support the use of Unicode strings.

StringElementSize

StringElementSize returns the size in bytes of one element (code unit) of a given string. Consider the following code:

procedure TForm88.Button3Click(Sender: TObject);
var
  A: AnsiString;
  U: UnicodeString;
begin
  A := 'This is an AnsiString';
  Memo1.Lines.Add('The ElementSize for an AnsiString is: ' + IntToStr(StringElementSize(A)));
  U := 'This is a UnicodeString';
  Memo1.Lines.Add('The ElementSize for a UnicodeString is: ' + IntToStr(StringElementSize(U)));
end;

The result of the code above will be:

The ElementSize for an AnsiString is: 1
The ElementSize for a UnicodeString is: 2

StringCodePage

StringCodePage will return the Word value that corresponds to the codepage for a given string.

Consider the following code:

procedure TForm88.Button2Click(Sender: TObject);
type
  // The code page for ANSI-Cyrillic is 1251
  CyrillicString = type AnsiString(1251);
var
  A: AnsiString;
  U: UnicodeString;
  U8: UTF8String;
  C: CyrillicString;
begin
  A := 'This is an AnsiString';
  Memo1.Lines.Add('AnsiString Codepage: ' + IntToStr(StringCodePage(A)));
  U := 'This is a UnicodeString';
  Memo1.Lines.Add('UnicodeString Codepage: ' + IntToStr(StringCodePage(U)));
  U8 := 'This is a UTF8string';
  Memo1.Lines.Add('UTF8string Codepage: ' + IntToStr(StringCodePage(U8)));
  C := 'This is a CyrillicString';
  Memo1.Lines.Add('CyrillicString Codepage: ' + IntToStr(StringCodePage(C)));
end;

On a system whose default ANSI code page is 1252, the above code produces the following output:

AnsiString Codepage: 1252
UnicodeString Codepage: 1200
UTF8string Codepage: 65001
CyrillicString Codepage: 1251

Other RTL Features for Unicode

There are a number of other routines for converting strings from one code page to another, including:

UnicodeStringToUCS4String
UCS4StringToUnicodeString
UnicodeToUtf8
Utf8ToUnicode

In addition the RTL also declares a type called RawByteString which is a string type with no encoding affiliated with it:

  RawByteString = type AnsiString($FFFF);

The purpose of the RawByteString type is to enable the passing of string data of any code page without doing any code page conversions. This is most useful for routines that do not care about a specific encoding, such as byte-oriented string searches. Normally, this means that parameters of routines that process strings without regard for the string's code page should be of type RawByteString. Declaring variables of type RawByteString should rarely, if ever, be done, as this can lead to undefined behavior and potential data loss.
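As a minimal sketch of the parameter usage described above, the console program below declares a byte-oriented routine that takes a RawByteString parameter. CountByte is a hypothetical helper written for this illustration, not an RTL routine.

```delphi
program RawByteDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper (not part of the RTL): counts how often a byte value
// occurs in the payload, regardless of the code page the string carries.
function CountByte(const S: RawByteString; B: Byte): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if Ord(S[I]) = B then
      Inc(Result);
end;

var
  A: AnsiString;
  U8: UTF8String;
begin
  A := 'banana';
  U8 := 'banana';
  // Both arguments are passed as they are; the RawByteString parameter
  // does not trigger a code page conversion.
  Writeln(CountByte(A, Ord('a')));
  Writeln(CountByte(U8, Ord('a')));
end.
```

Both calls should report 3 here, since the text is plain ASCII. Note that only the parameter is a RawByteString; the variables keep concrete string types, in line with the advice above.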

In general, string types are assignment compatible with each other.

For instance:

MyUnicodeString := MyAnsiString;

will perform as expected – it will take the contents of the AnsiString and place them into a UnicodeString. You should in general be able to assign one string type to another, and the compiler will do the work needed to make the conversions, if possible.

Some conversions, however, can result in data loss, and one must watch out for this when moving from a string type that includes Unicode data to one that does not. For instance, you can assign a UnicodeString to an AnsiString, but if the UnicodeString contains characters that have no mapping in the active ANSI code page at runtime, those characters will be lost in the conversion. Consider the following code:

procedure TForm88.Button4Click(Sender: TObject);
var
  U: UnicodeString;
  A: AnsiString;
begin
  U := 'This is a UnicodeString';
  A := U;
  Memo1.Lines.Add(A);
  U := 'Добро пожаловать в мир Юникода с использованием Дельфи 2009!!';
  A := U;
  Memo1.Lines.Add(A);
end;

The output of the above when the current OS code page is 1252 is:

This is a UnicodeString
????? ?????????? ? ??? ??????? ? ?????????????? ?????? 2009!!

As you can see, because Cyrillic characters have no mapping in Windows-1252, information was lost when this UnicodeString was assigned to an AnsiString: every character that the AnsiString's code page cannot represent was replaced by a question mark.
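One way to detect this kind of loss before it matters is a round-trip check. The sketch below is an illustration only: FitsInAnsi is a hypothetical helper, and the Cyrillic text is written as character codes so the example does not depend on how the source file is saved.

```delphi
program LossCheck;
{$APPTYPE CONSOLE}

// Hypothetical helper: True when converting U to the default ANSI code page
// and back again yields the original text, i.e. no character was replaced.
function FitsInAnsi(const U: UnicodeString): Boolean;
var
  A: AnsiString;
begin
  A := AnsiString(U);               // explicit cast, so the conversion is deliberate
  Result := UnicodeString(A) = U;   // compare the round-tripped text with the original
end;

var
  Latin, Cyrillic: UnicodeString;
begin
  Latin := 'Delphi 2009';
  // The Cyrillic word for "Delphi", spelled out as UTF-16 code units.
  Cyrillic := #$0414#$0435#$043B#$044C#$0444#$0438;
  Writeln(FitsInAnsi(Latin));
  Writeln(FitsInAnsi(Cyrillic));
end.
```

The second result depends on the machine: with an ANSI code page of 1252 the Cyrillic text does not survive the round trip, while with 1251 it should.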

SetCodePage

SetCodePage, declared in the System.pas unit as

procedure SetCodePage(var S: AnsiString; CodePage: Word; Convert: Boolean);

is a new RTL procedure that sets a new code page for a given AnsiString. The optional Convert parameter determines whether the payload of the string itself should be converted to the given code page. If Convert is False, only the code page recorded for the string is changed. If Convert is True, the payload of the passed string is converted to the given code page.

SetCodePage should be used sparingly and with great care. Note that if the codepage doesn’t actually match the existing payload (i.e. Convert is set to False), then unpredictable results can occur. Also if the existing data in the string is converted and the new codepage doesn’t have a representation for a given original character, data loss can occur.

Getting TBytes from Strings

The RTL also includes a set of overloaded routines for extracting an array of bytes from a string. As we’ll see in Part III, it is recommended that you use TBytes rather than a string as a data buffer. The RTL makes this easy by providing overloaded versions of BytesOf() that take the different string types as a parameter.
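A short sketch of how these overloads might be used follows. It assumes the SysUtils declarations of TBytes, BytesOf, and TEncoding; the last call is shown only as a contrast, for the case where the caller wants to pick the encoding explicitly.

```delphi
program BytesOfDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;

var
  A: AnsiString;
  U: UnicodeString;
  B: TBytes;
begin
  A := 'abc';
  B := BytesOf(A);                   // overload taking an ANSI-style string
  Writeln('From AnsiString:    ', Length(B), ' bytes');

  U := 'abc';
  B := BytesOf(U);                   // overload taking a UnicodeString
  Writeln('From UnicodeString: ', Length(B), ' bytes');

  B := TEncoding.UTF8.GetBytes(U);   // encoding chosen explicitly by the caller
  Writeln('As UTF-8:           ', Length(B), ' bytes');
end.
```

Printing the lengths rather than assuming them is deliberate: how many bytes come back depends on the encoding each routine applies.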

Conclusion

Tiburon’s Runtime Library is now completely capable of supporting the new UnicodeString. It includes new classes and routines for handling, processing, and converting Unicode strings, for managing codepages, and for ensuring an easy migration from earlier versions.

In Part III, we’ll cover the specific code constructs that you’ll need to look out for in ensuring that your code is Unicode ready.

(Source: http://dn.codegear.com/article/38498)

---
This article was written and published with MultiBlogWriter, the blog content management software from “国华软件”.

posted @ 2010-01-11 22:46 cpploverr | Views (260) | Comments (0)

Delphi in a Unicode World Part I

---

Delphi in a Unicode World Part I: What is Unicode, Why do you need it, and How do you work with it in Delphi?

By: Nick Hodges

Original article: http://dn.codegear.com/article/38437

Abstract: This article discusses Unicode, how Delphi developers can benefit from using Unicode, and how Unicode will be implemented in Delphi 2009.

Introduction

The Internet has broken down geographical barriers, enabling world-wide software distribution. As a result, applications can no longer live in a purely ANSI-based environment. The world has embraced Unicode as the standard means of transferring text and data. Since it provides support for virtually any writing system in the world, Unicode text is now the norm throughout the global technological ecosystem.

What is Unicode?

Unicode is a character encoding scheme that allows virtually all alphabets to be encoded into a single character set. Unicode allows computers to manage and represent text in most of the world’s writing systems. Unicode is managed by The Unicode Consortium and codified in a standard. More simply put, Unicode is a system for enabling everyone to use each other’s alphabets. Heck, there is even a Unicode version of Klingon.

This series of articles isn’t meant to give you a full rundown of exactly what Unicode is and how it works; instead it is meant to get you going on using Unicode within Delphi 2009. If you want a good overview of Unicode, Joel Spolsky has a great article entitled “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)” which is highly recommended reading. As Joel clearly points out “IT’S NOT THAT HARD”. This article, Part I of III, will discuss why Unicode is important, and how Delphi will implement the new UnicodeString type.

Why Unicode?

Among the many new features found in Delphi 2009 is the imbuing of Unicode throughout the product. The default string in Delphi is now a Unicode-based string. Since Delphi is largely built with Delphi, the IDE, the compiler, the RTL, and the VCL all are fully Unicode-enabled.

The move to Unicode in Delphi is a natural one. Windows itself is fully Unicode-aware, so it is only natural that applications built for it use a Unicode string as the default string. And for Delphi developers, the benefits don’t stop merely at being able to use the same string type as Windows.

The addition of Unicode support provides Delphi developers with a great opportunity. Delphi developers can now read, write, accept, produce, display, and deal with Unicode data, and it’s all built right into the product. With only a few code changes, or in some cases none, your applications can be ready for any kind of data you, your customers, or your users can throw at them. Applications that were previously restricted to ANSI-encoded data can be easily modified to handle almost any character set in the world.

Delphi developers will now be able to serve a global market with their applications -- even if they don’t do anything special to localize or internationalize their applications. Windows itself supports many different localized versions, and Delphi applications need to be able to adapt and work on machines running any of the large number of locales that Windows supports, including the Japanese, Chinese, Greek, or Russian versions of Windows. Users of your software may be entering non-ANSI text into your application or using non-ANSI based path names. ANSI-based applications won’t always work as desired in those scenarios. Windows applications built with a fully Unicode-enabled Delphi will be able to handle and work in those situations. Even if you don’t translate your application into any other spoken languages, your application still needs to be able to work properly -- no matter what the end user’s locale is.

For existing ANSI-based Delphi applications, the opportunity to localize applications and expand the reach of those applications into Unicode-based markets is potentially huge. And if you do want to localize your applications, Delphi makes that very easy, especially now at design-time. The Integrated Translation Environment (ITE) enables you to translate, compile, and deploy an application right in the IDE. If you require external translation services, the IDE can export your project in a form that translators can use in conjunction with the deployable External Translation Manager. These tools work together with the Delphi IDE for both Delphi and C++Builder to make localizing your applications a smooth and easy-to-manage process.

The world is Unicode-based, and now Delphi developers can be a part of that in a native, organic way. So if you want to be able to handle Unicode data, or if you want to sell your applications to emerging and global markets, you can do it with Delphi 2009.

A Word about Terminology

Unicode encourages the use of some new terms. For instance, the idea of a “character” is a bit less precise in the world of Unicode than you might be used to. In Unicode, the more precise term is “code point”. In Delphi 2009, SizeOf(Char) is 2, but even that doesn’t tell the whole story. Depending on the encoding, a given character can take up more than two bytes. In UTF-16, such a character is stored as two Char elements called a “surrogate pair”. So a code point is a unique code assigned to an element defined by Unicode.org. Most commonly that is a “character”, but not always.
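To make the difference between Char elements and code points concrete, here is a small sketch. CodePointLength is a hypothetical helper written for this illustration; it checks the UTF-16 surrogate ranges by hand rather than relying on any RTL routine.

```delphi
program CodePointCount;
{$APPTYPE CONSOLE}

// Hypothetical helper: counts code points in a UTF-16 string by treating a
// high surrogate ($D800..$DBFF) followed by a low surrogate ($DC00..$DFFF)
// as a single code point.
function CodePointLength(const S: UnicodeString): Integer;
var
  I: Integer;
begin
  Result := 0;
  I := 1;
  while I <= Length(S) do
  begin
    if (Ord(S[I]) >= $D800) and (Ord(S[I]) <= $DBFF) and (I < Length(S)) and
       (Ord(S[I + 1]) >= $DC00) and (Ord(S[I + 1]) <= $DFFF) then
      Inc(I, 2)   // surrogate pair: two Char elements, one code point
    else
      Inc(I);     // ordinary Char element (or an unpaired surrogate)
    Inc(Result);
  end;
end;

var
  S: UnicodeString;
begin
  // 'A' followed by U+1D11E (musical symbol G clef), which lies outside
  // the Basic Multilingual Plane and therefore needs a surrogate pair.
  S := 'A' + #$D834#$DD1E;
  Writeln(Length(S));            // Char elements: 3
  Writeln(CodePointLength(S));   // code points: 2
end.
```

Length reports three elements for a string that holds only two code points, which is exactly why code that equates one Char with one character needs a second look.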

Another term you will see in relation to Unicode is “BOM”, or Byte Order Mark, and that is a very short prefix used at the beginning of a text file to indicate the type of encoding used for that text file. MSDN has a nice article on what a BOM is. The new TEncoding Class (to be discussed in Part II) has a class method called GetPreamble which returns the BOM for a given encoding.
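For illustration, the sketch below prints the preamble bytes of two encodings. It assumes the TEncoding class and TBytes type from SysUtils; BytesToHex is a hypothetical helper used only for display.

```delphi
program ShowPreamble;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical helper: formats a byte array as space-separated hex values.
function BytesToHex(const B: TBytes): string;
var
  I: Integer;
begin
  Result := '';
  for I := 0 to High(B) do
    Result := Result + IntToHex(B[I], 2) + ' ';
end;

begin
  // The UTF-8 BOM is EF BB BF; the little-endian UTF-16 BOM is FF FE.
  Writeln('UTF-8:  ', BytesToHex(TEncoding.UTF8.GetPreamble));
  Writeln('UTF-16: ', BytesToHex(TEncoding.Unicode.GetPreamble));
end.
```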

Now that all that has been explained, we’ll look at how Delphi 2009 implements a Unicode-based string.

The New UnicodeString Type

The default string in Delphi 2009 is the new UnicodeString type. By default, the UnicodeString type will have an affinity for UTF-16, the same encoding used by Windows. This is a change from previous versions which had AnsiString as the default type. The Delphi RTL has in the past included the WideString type to handle Unicode data, but this type is not reference-counted as the AnsiString type is, and thus isn’t as full-featured as Delphi developers expect the default string to be.

For Delphi 2009, a new UnicodeString type has been designed that incorporates the capabilities of both the AnsiString and WideString types. A UnicodeString can contain either a Unicode-sized character, or an ANSI byte-sized character. (Note that both the AnsiString and WideString types will remain in place.) The Char and PChar types will map to WideChar and PWideChar, respectively. Note, as well, that no string types have disappeared. All the types that developers are used to still exist and work as before.

However, for Delphi 2009, the default string type will be equivalent to UnicodeString. In addition, the default Char type is WideChar, and the default PChar type is PWideChar.

That is, the following code is declared by the compiler:

type
  string = UnicodeString;
  Char = WideChar;
  PChar = PWideChar;

UnicodeString is assignment compatible with all other string types; however, assignments between AnsiStrings and UnicodeStrings will do type conversions as appropriate. Thus, an assignment of a UnicodeString type to an AnsiString type could result in data-loss. That is, if a UnicodeString contains high-order byte data, a conversion of that string to AnsiString will result in a loss of that high-order byte data.

The important thing to note here is that this new UnicodeString behaves pretty much like strings always have (with the notable exception of their ability to hold Unicode data, of course). You can still add any string data to them, you can index them, you can concatenate them with the ‘+’ sign, etc.

For example, instances of a UnicodeString will still be able to index characters. Consider the following code:

 var
   MyChar: Char;
   MyString: string;
 begin
   MyString := 'This is a string';
   MyChar := MyString[1];
 end;

The variable MyChar will still hold the character found at the first index position, i.e. ‘T’. The functionality of this code hasn’t changed at all. Similarly, if we are handling Unicode data:

 var
   MyChar: Char;
   MyString: string;
 begin
   MyString := '世界您好';
   MyChar := MyString[1];
 end;

The variable MyChar will still hold the character found at the first index position, i.e. ‘世’.

The RTL provides helper functions that enable users to do explicit conversions between codepages and element size conversions. If the user is using the Move function on the character array, they cannot make assumptions about the element size.
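A minimal sketch of the point about Move: size the destination in bytes from the element count and the element size instead of assuming one byte per character. TBytes is assumed to come from SysUtils; StringElementSize is the RTL routine covered in Part II.

```delphi
program MoveDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;

var
  S: string;
  Buf: TBytes;
begin
  S := 'Unicode';
  // Buffer size in bytes = number of elements * bytes per element.
  SetLength(Buf, Length(S) * StringElementSize(S));
  if Length(S) > 0 then
    Move(S[1], Buf[0], Length(Buf));   // Move counts bytes, not characters
  Writeln(Length(S), ' elements, ', Length(Buf), ' bytes');
end.
```

With the Unicode default string this reports twice as many bytes as elements. Code that passes Length(S) straight to Move would copy only half of the data.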

As you can imagine, this new string type has ramifications for existing code. With Unicode, it is no longer true that one Char represents one Byte. In fact, it isn’t even always true that one Char is equal to two bytes! As a result, you may have to make some adjustments to your code. However, we’ve worked very hard to make the transition a smooth one, and we are confident that you’ll be able to be up and running quite quickly. Parts II and III of this series will discuss further the new UnicodeString type, talk about some of the new features of the RTL that support Unicode enablement, and then discuss specific coding idioms that you’ll want to look for in your code. This series should help make your transition to Unicode a smooth and painless endeavor.

Conclusion

With the addition of Unicode as the default string, Delphi can accept, process, and display virtually any alphabet or code page in the world. Applications you build with Delphi 2009 will be able to accept, display, and handle Unicode text with ease, and they will work much better in almost any Windows locale. Delphi developers can now easily localize and translate their applications to enter markets that have previously been more difficult to enter. It’s a Unicode world out there, and now your Delphi apps can live in it.

In Part II, we’ll discuss the changes and updates to the Delphi Runtime Library that will enable you to work easily with Unicode strings.


(Source: http://dn.codegear.com/article/38437)


posted @ 2010-01-11 22:43 cpploverr | Views (136) | Comments (0)

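How Do You Create an E-Book in CHM Format?

What is a CHM help file?

Most older software used help files with the .HLP extension (WinHelp). As the Internet developed, that format could no longer keep up with the need for online help or with the demand for help that is friendlier and easier to browse, so Microsoft introduced a completely new help system, HTML Help, starting with Windows 98. A CHM file is a compressed collection of web pages, which keeps the file small and makes it easier for users to download from the Internet. It also supports HTML, ActiveX, Java, JScript, Visual Basic Scripting, and several image formats (.jpeg, .gif, .png, and so on), so it quickly became popular with software authors and users alike. Its uses, however, go well beyond help files.

Below, we take the novel 《围城》 as an example and turn it into an easy-to-read electronic edition, walking through the whole process of building a CHM document step by step. After this walkthrough you should be able to produce whatever electronic documents you like. As the saying goes, a craftsman who wants to do good work must first sharpen his tools, so let us start with the tool.

Among the tools that can build CHM files, the most commonly used and easiest to use is probably Easy CHM from “国华软件”. A full installation takes only 4.2 MB, and the program offers a large number of options, so you can customize the CHM file to your liking. More importantly, Easy CHM's table-of-contents and index editors are very complete: you can multi-select and drag any items, and you can batch-replace text in the contents and index, which makes editing convenient. Easy CHM is also clearly optimized for speed. It loads the contents tree very quickly, and the advantage shows most when there are many contents and index entries.

After installing Easy CHM, you can open it from Start menu | Programs | Easy CHM. To build a CHM help file, everything that should appear in it must first be prepared as relatively self-contained web pages (HTML files). You can do this in Easy CHM or with another web page editor (such as FrontPage or HotDog).

For our example, 《围城》, the pages can be split by chapter. The HTML files should also link to one another; for instance, each chapter should carry hyperlinks to the previous and next chapters. Once these pages are finished, save them into a newly created folder.

First, here are the intermediate files that will be used:

1. The .hhp file. This is the most commonly used one; the name stands for “HTML Help project”. It is the file used most directly to generate a CHM: a single .hhp file is enough to compile the corresponding CHM file.

2. The .hhc file. The name stands for “HTML Help table of contents”. It supplies the table of contents that normally appears in the left pane of the CHM viewer. It cannot be compiled into a CHM file on its own; it has to be included in an .hhp file first.

3. The .hhk file. The name stands for “HTML Help Index Keyword”. It provides the keyword index lookup of a CHM file, an indispensable part of an easy-to-browse help file. Like the .hhc file, it cannot be compiled into a CHM file on its own.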

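The basic steps for building a CHM with Easy CHM are:

1. After launching the program, click the “New” button on the toolbar to create a new project and choose the project directory (the folder that contains your web pages). Click “OK” to import all the pages.
2. Click the “Compile” button on the toolbar, enter a title for the CHM file (for example: 小椴作品典藏版v1.5), choose the first page and the home page of the CHM, and choose where to save the CHM file.
3. Click “CHM Settings” and, on the “Panels” tab, clear the “Contents, Index, Search, Bookmarks” checkboxes. (This depends on your needs: if your book should show any of these panes, leave them checked.)
On the “Position” tab, set the width to 1024 (mine is a standard screen) and the height to 738 (adjust this value to your needs); the “Window positioning tool” can be used to set these. On the “Other” tab, clear “Allow the CHM to remember the window position and size”.
4. Click OK to return to the “Compile Project to CHM” dialog, click “Apply” to save the settings above, then click “Generate CHM”. The program builds the e-book automatically.
5. Click “View CHM” and check for anything that looks wrong. Fix it if needed; otherwise the book is done.

Before long, a CHM edition of the novel has been generated. Although it is only a single file, the CHM runs entirely independently of your own machine and of Easy CHM (it requires IE 4.0 or later).

That is the whole process for a fairly simple CHM help file (no table of contents, no index).

After this introduction you should be able to edit and build help files and electronic documents yourself. Easy CHM has many other uses. One worth mentioning is that it can not only compile a CHM file but also decompile an existing CHM document, so you can directly study and reuse the best parts of well-made help documents. A little “borrowing” now and then feels rather good.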

posted @ 2010-01-11 22:11 cpploverr | Views (2045) | Comments (3)

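Delphi and the UAC Control Provided by Vista

UAC is one of the new features in Vista. Its main purpose is to prevent malicious modification of the operating system itself. Any change to Vista's system settings has to pass UAC verification first, which greatly improves the security of the system.

Opinions online about the pros and cons of UAC are mixed; we will not go into them here.

UAC affects Delphi programs mainly in the following ways:

1. Because of UAC, operations a Delphi program performs on the system may fail silently, while the same program runs normally under Windows 2000/XP. Changes to the registry are one example.

2. To avoid this, a Delphi program must support Vista's UAC marking, that is, show the shield overlay on the program's icon. This alerts the user that the program needs elevated privileges.

To decide whether to display the “UAC shield”, Vista appears to read the MANIFEST resource in the program's resources.

To support Vista, a Delphi program must therefore embed the MANIFEST information in its resources.

1. First create a file with the following content: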

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
<trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
    <security>
      <requestedPrivileges>
        <requestedExecutionLevel level="requireAdministrator"/>
      </requestedPrivileges>
    </security>
</trustInfo>
</assembly>

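Save it as UAC.manifest; the file name is arbitrary. Pay particular attention to “requireAdministrator” (highlighted in red in the original post), which indicates that the program needs administrator (Administrator) rights to run properly.

2. Then create an RC file named uac.rc: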

1 24 UAC.manifest

Where:

1 - the resource ID

24 - the resource type, RT_MANIFEST

UAC.manifest - the name of the file created above

3. Compile the rc file into a res file with brcc32:

brcc32 uac.rc -fouac.res

4. Add the following to the program:

{$R uac.res}

so that Delphi links uac.res into the exe file when it compiles.

5. Run the file under Vista, and the program icon will show the UAC shield overlay.


posted @ 2010-01-10 18:25 cpploverr | Views (305) | Comments (0)
