Can files saved by the ANSI build still be read by the Unicode build?

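Here is the situation: a program was ported from ANSI to Unicode (see "Porting code to UNICODE", http://www.cppblog.com/flyingxu/archive/2006/05/18/7356.html), and that raised a question: can the Unicode build still open the files that the ANSI build saved earlier? This is a big deal. If it can't, the program is basically half-dead.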

The files are saved mostly through serialization, for example:

void CSDITestDoc::Serialize(CArchive& ar)
{
    if (ar.IsStoring())
    {
        // TODO: add storing code here
        ar << m_strName;
    }
    else
    {
        // TODO: add loading code here
        ar >> m_strName;
    }
}

If m_strName = _T("name"); then the ANSI build writes:
04 6E 61 6D 65, 5 bytes in total.
The Unicode build writes:
FF FE FF 04 6E 00 61 00 6D 00 65 00, 12 bytes in total.

Then I found that the ANSI build can open files written by the Unicode build, and the Unicode build can open files written by the ANSI build.

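Why? I think the key is CString, which is more capable than it looks: its serialization operators know about both encodings.
Here is CString's serialization (write) function: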

// CString serialization code
// String format:
//      UNICODE strings are always prefixed by 0xff, 0xfffe
//      if < 0xff chars: len:BYTE, TCHAR chars
//      if >= 0xff characters: 0xff, len:WORD, TCHAR chars
//      if >= 0xfffe characters: 0xff, 0xffff, len:DWORD, TCHARs

CArchive& AFXAPI operator<<(CArchive& ar, const CString& string)
{
    // special signature to recognize unicode strings
#ifdef _UNICODE
    ar << (BYTE)0xff;
    ar << (WORD)0xfffe;
#endif

    if (string.GetData()->nDataLength < 255)
    {
        ar << (BYTE)string.GetData()->nDataLength;
    }
    else if (string.GetData()->nDataLength < 0xfffe)
    {
        ar << (BYTE)0xff;
        ar << (WORD)string.GetData()->nDataLength;
    }
    else
    {
        ar << (BYTE)0xff;
        ar << (WORD)0xffff;
        ar << (DWORD)string.GetData()->nDataLength;
    }
    ar.Write(string.m_pchData, string.GetData()->nDataLength*sizeof(TCHAR));
    return ar;
}
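
To check the byte layout by hand, here is a minimal sketch of the same length-prefix scheme in standard C++, without MFC. The names Bytes, putLE, putLength, encodeAnsi and encodeUnicode are made up for this illustration, and the sketch assumes little-endian output, which is what the byte dumps earlier in the post show.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append the low `byteCount` bytes of `value` in little-endian order.
static void putLE(Bytes& out, std::uint32_t value, int byteCount)
{
    for (int i = 0; i < byteCount; ++i)
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
}

// Append the variable-width length prefix described in the MFC comment:
// BYTE if < 0xff, else 0xff + WORD if < 0xfffe, else 0xff + 0xffff + DWORD.
static void putLength(Bytes& out, std::size_t len)
{
    if (len < 0xff) {
        out.push_back(static_cast<std::uint8_t>(len));
    } else if (len < 0xfffe) {
        out.push_back(0xff);
        putLE(out, static_cast<std::uint32_t>(len), 2);
    } else {
        out.push_back(0xff);
        putLE(out, 0xffff, 2);
        putLE(out, static_cast<std::uint32_t>(len), 4);
    }
}

// Hypothetical stand-in for the ANSI build: prefix, then 1 byte per character.
Bytes encodeAnsi(const std::string& s)
{
    Bytes out;
    putLength(out, s.size());
    for (unsigned char c : s)
        out.push_back(c);
    return out;
}

// Hypothetical stand-in for the Unicode build: the FF FE FF signature,
// the prefix, then 2 bytes per UTF-16 code unit.
Bytes encodeUnicode(const std::u16string& s)
{
    Bytes out;
    out.push_back(0xff);
    putLE(out, 0xfffe, 2);
    putLength(out, s.size());
    for (char16_t c : s)
        putLE(out, c, 2);
    return out;
}
```

Calling encodeAnsi("name") and encodeUnicode(u"name") should reproduce the 5-byte and 12-byte dumps shown earlier. The signature works because a leading 0xff already means "a WORD follows", and the WORD value 0xfffe is never used as a real length.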

CString's serialization code has special handling for Unicode,
especially on the reading side:

CArchive& AFXAPI operator>>(CArchive& ar, CString& string)
{
#ifdef _UNICODE
    int nConvert = 1;   // if we get ANSI, convert
#else
    int nConvert = 0;   // if we get UNICODE, convert
#endif

    UINT nNewLen = _AfxReadStringLength(ar);
    if (nNewLen == (UINT)-1)
    {
        nConvert = 1 - nConvert;
        nNewLen = _AfxReadStringLength(ar);
        ASSERT(nNewLen != -1);
    }

    // set length of string to new length
    UINT nByteLen = nNewLen;
#ifdef _UNICODE
    string.GetBufferSetLength((int)nNewLen);
    nByteLen += nByteLen * (1 - nConvert);  // bytes to read
#else
    nByteLen += nByteLen * nConvert;    // bytes to read
    if (nNewLen == 0)
        string.GetBufferSetLength(0);
    else
        string.GetBufferSetLength((int)nByteLen+nConvert);
#endif

    // read in the characters
    if (nNewLen != 0)
    {
        ASSERT(nByteLen != 0);

        // read new data
        if (ar.Read(string.m_pchData, nByteLen) != nByteLen)
            AfxThrowArchiveException(CArchiveException::endOfFile);

        // convert the data if as necessary
        if (nConvert != 0)
        {
#ifdef _UNICODE
            CStringData* pOldData = string.GetData();
            LPSTR lpsz = (LPSTR)string.m_pchData;
#else
            CStringData* pOldData = string.GetData();
            LPWSTR lpsz = (LPWSTR)string.m_pchData;
#endif
            lpsz[nNewLen] = '\0';    // must be NUL terminated
            string.Init();   // don't delete the old data
            string = lpsz;   // convert with operator=(LPWCSTR)
            CString::FreeData(pOldData);
        }
    }
    return ar;
}

In other words, whether the saved file is Unicode or ANSI, it can be read in and converted to whatever the current build uses, Unicode or ANSI.
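
The detection step can be sketched in standard C++ as follows. This is only an illustration under stated assumptions: readLength is my guess at what _AfxReadStringLength does (its source is not quoted in this post), decodeToUtf16 and the other names are made up, and ANSI bytes are assumed to be plain ASCII so that widening is a simple zero-extend. Real MFC converts through the active code page instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Plays the role of (UINT)-1 in the MFC code: "a Unicode signature was seen".
constexpr std::uint32_t kUnicodeMarker = 0xffffffffu;

// Read `byteCount` bytes as a little-endian integer and advance `pos`.
static std::uint32_t getLE(const Bytes& in, std::size_t& pos, int byteCount)
{
    if (in.size() - pos < static_cast<std::size_t>(byteCount))
        throw std::runtime_error("unexpected end of data");
    std::uint32_t value = 0;
    for (int i = 0; i < byteCount; ++i)
        value |= static_cast<std::uint32_t>(in[pos++]) << (8 * i);
    return value;
}

// Assumed behavior: BYTE length, or 0xff followed by a WORD, where the WORD
// 0xfffe means "Unicode signature" and 0xffff means "a DWORD length follows".
static std::uint32_t readLength(const Bytes& in, std::size_t& pos)
{
    std::uint32_t b = getLE(in, pos, 1);
    if (b < 0xff)
        return b;
    std::uint32_t w = getLE(in, pos, 2);
    if (w == 0xfffe)
        return kUnicodeMarker;
    if (w == 0xffff)
        return getLE(in, pos, 4);
    return w;
}

// Hypothetical reader for a Unicode build: accepts either layout.
std::u16string decodeToUtf16(const Bytes& in)
{
    std::size_t pos = 0;
    bool storedAsUnicode = false;
    std::uint32_t len = readLength(in, pos);
    if (len == kUnicodeMarker) {
        storedAsUnicode = true;
        len = readLength(in, pos);      // the real length follows the signature
        assert(len != kUnicodeMarker);
    }
    std::u16string result;
    for (std::uint32_t i = 0; i < len; ++i) {
        // 2 bytes per character for Unicode data, 1 byte (ASCII assumed) for ANSI.
        int width = storedAsUnicode ? 2 : 1;
        result.push_back(static_cast<char16_t>(getLE(in, pos, width)));
    }
    return result;
}
```

Feeding it either of the two byte dumps for "name" shown earlier should give back the same four-character string. An ANSI-build reader would do the mirror image: narrow when the signature is present, copy as-is when it is not.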

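Coming back to the title, though: this only covers data saved through CString serialization. For anything saved some other way, you can follow CString's approach and write your own functions so that the file can always be read in.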
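
As a rough idea of what "following CString's approach" could look like for a non-CString format, here is a sketch. It assumes a hypothetical legacy format that is just raw 8-bit ASCII text, which therefore never starts with the byte 0xFF. The new writer puts FF FE in front of UTF-16 data, and the reader uses those two bytes to tell the formats apart. saveText and loadText are made-up names, not part of MFC.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical new-format writer: signature FF FE, then UTF-16 little-endian.
Bytes saveText(const std::u16string& text)
{
    Bytes out{0xff, 0xfe};
    for (char16_t c : text) {
        out.push_back(static_cast<std::uint8_t>(c & 0xff));
        out.push_back(static_cast<std::uint8_t>(c >> 8));
    }
    return out;
}

// Hypothetical reader that accepts both the new and the legacy layout.
std::u16string loadText(const Bytes& file)
{
    std::u16string text;
    if (file.size() >= 2 && file[0] == 0xff && file[1] == 0xfe) {
        // New format: skip the signature and read 2 bytes per character.
        for (std::size_t i = 2; i + 1 < file.size(); i += 2)
            text.push_back(static_cast<char16_t>(file[i] | (file[i + 1] << 8)));
    } else {
        // Legacy format: 1 byte per character, ASCII assumed, so just widen.
        for (std::uint8_t b : file)
            text.push_back(static_cast<char16_t>(b));
    }
    return text;
}
```

The important part is picking a signature that old files could never begin with. If the legacy data can contain arbitrary bytes, this simple check is not enough, and a length or version field that old writers never produced would be needed instead.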

posted on 2006-06-13 23:53 flyingxu Category: VC/MFC
