formatting object in MagicText

Aug 12, 2016 at 4:50 PM
My application, running on 64-bit Windows 10, uses Docx to read .docx Word documents and analyze their contents.
To do so, it accesses the paragraphs and the "MagicText" objects they contain.
Each MagicText contains a "formatting" field, which is sometimes Null (Nothing in VB.Net).
A "formatting" object contains a series of fields, such as Bold, Italic, etc.

My application tests Bold and Italic, which are typically Boolean. Good.

In the previous version of Docx I used (V1.0.0.8), these two fields were never null.
In version V1.0.0.22, which I just installed, woe: with the same input documents these fields can be null, which crashes the application ...

This regression forces me to test whether the field is initialized before reading its value.
Example in VB.Net:
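    ' Formatting object of the current MagicText; may be Nothing.
    Public ReadOnly CurrentFormat As Formatting = Magic.formatting

    Public ReadOnly Property __IsBold__ As Boolean
        Get
            ' No formatting object at all: treat as not bold.
            If (CurrentFormat Is Nothing) Then Return False
            ' Bold can be Nothing; fall back to False in that case.
            Return If(CurrentFormat.Bold, False)
        End Get
    End Property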
In general, why can typed fields of "formatting" (especially Boolean ones) take a Null value?

Thank you for your reply.

Aug 15, 2016 at 8:18 PM
I don't remember now why it was implemented, but between those two versions (or even further, when it comes to what's in the sources on GitHub) there was a huge amount of changes. You would have to go through the commits and see when it changed, and why. There had to be a reason.
Aug 15, 2016 at 9:51 PM
MadBoy, thank you for your reply.
However, it seems a little "banal". Yes, there certainly is a reason, but I wish it were explained to me. Why can these Booleans be NULL? For example, a text is either in italics or not. Ditto for bold.
Aug 15, 2016 at 9:59 PM
Well there are 3 states actually.

None, True and False. If it's None, then no formatting is applied. Think of it as clear text. Considering that each char can be made italic, bold, underlined or other, you can have a lot of information for a single word, and a lot of formatting options for the whole document.

So I would say:
  • none is used when formatting should be standard
  • true / false is used when you explicitly are saying one or another
If I remember correctly, formatting takes null for replacing, searching or matching of formats, where a value is not always needed, but I can't really find a reason now.
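
The three states described above can be sketched with a nullable Boolean. This is only an illustration, not the library's code: the `Effective` helper and the idea of an "inherited default" are assumptions made for the example. It shows that an unset value (Nothing) falls back to whatever the standard formatting is, while an explicit True or False always wins.

```vbnet
Imports System

Module TriStateDemo
    ' Hypothetical helper (not part of Docx): an explicit value wins,
    ' an unset value (Nothing) falls back to the inherited default.
    Function Effective(explicitValue As Boolean?, inheritedDefault As Boolean) As Boolean
        Return If(explicitValue, inheritedDefault)
    End Function

    Sub Main()
        Dim unset As Boolean? = Nothing      ' no formatting stated
        Dim forcedOff As Boolean? = False    ' explicitly not bold
        Dim forcedOn As Boolean? = True      ' explicitly bold

        Console.WriteLine(Effective(unset, True))      ' follows the default
        Console.WriteLine(Effective(forcedOff, True))  ' overrides the default
        Console.WriteLine(Effective(forcedOn, False))  ' overrides the default
    End Sub
End Module
```

With only True and False there would be no way to tell "explicitly not bold" apart from "nothing was said about bold".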
Aug 15, 2016 at 10:05 PM
You can also picture it this way:
  • Take Hello World
  • Make each letter bold and then unbold it
  • You now have 10 formatting options set even though nothing is set
  • Your document now contains a lot more useless data
Aug 15, 2016 at 10:52 PM
Thank you, MadBoy, for your answer.

There is something I do not understand:

In a paragraph, each change of text format starts a new MagicText, doesn't it?
A MagicText may include a formatting object (if it's an object).
Whether the Bold field, for example, is initialized (true or false) or not (Nothing), this does not change the volume of data, I think. Right?
Aug 16, 2016 at 11:17 AM
Like I said above, there is a reason why null was used instead of only true/false. If you wish to explore why, please find the source code and explore. I'm not up to date on why that particular change was made. It may be that in your particular case it doesn't make sense. You can probably write a method around it to allow only true/false.
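
Such a wrapper could look roughly like the sketch below. The `FakeFormatting` class is a hypothetical stand-in with nullable `Bold` and `Italic` properties, used only so the example is self-contained; the real formatting type may differ. The helpers collapse both a missing formatting object and an unset field to False.

```vbnet
Imports System

' Hypothetical stand-in for the library's formatting type (assumption).
Public Class FakeFormatting
    Public Property Bold As Boolean?
    Public Property Italic As Boolean?
End Class

Public Module FormattingHelpers
    ' True only when a formatting object exists and Bold is explicitly True.
    Public Function IsBold(fmt As FakeFormatting) As Boolean
        Return fmt IsNot Nothing AndAlso fmt.Bold.GetValueOrDefault(False)
    End Function

    ' Same rule for Italic.
    Public Function IsItalic(fmt As FakeFormatting) As Boolean
        Return fmt IsNot Nothing AndAlso fmt.Italic.GetValueOrDefault(False)
    End Function
End Module
```

Calling code can then test `IsBold(someFormatting)` without any null checks of its own. Note that this deliberately throws away the distinction between "unset" and "explicitly False".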
Aug 16, 2016 at 1:16 PM
After a night of reflection (!) I think the reason may be to save processing, not to save memory.
Thank you for your answers anyway.
Docx is a helpful source!

Have a good day.
Marked as answer by MadBoy on 8/18/2016 at 7:42 AM