Is the lack of privacy a real shortcoming of the language, or is our judgment clouded by the old conventions of C++ and Java? Why do we need private variables anyway -- at what point does defensive programming become paranoia?
In my last Python article I wrote a short implementation of a read-only attribute in a Python class, whilst noting that if you were determined enough you could still change anything you like in a Python object. I've had a couple of comments since then asking why I didn't just make the attribute private, or asking whether Python supports private variables. In short, it doesn't -- there's no 'private' keyword and no other way of enforcing that a field is private.
Still, it got me wondering: is it possible to somehow implement private variables? Python gives us a lot of flexibility to change how variables are accessed, so is it possible to somehow protect class variables?
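The nearest things Python offers are naming conventions: a single leading underscore, and double-underscore name mangling. A minimal sketch (Python 3; the Account class and its fields are hypothetical names for illustration) suggests that neither one actually locks anything:

```python
class Account:
    def __init__(self, balance):
        self._hint = "internal by convention"  # single underscore: a polite request
        self.__balance = balance               # double underscore: name-mangled

    def balance(self):
        return self.__balance


acct = Account(100)

# The name as written inside the class is not reachable from outside it...
try:
    acct.__balance
except AttributeError:
    print("acct.__balance raises AttributeError")

# ...but the value is stored under a predictable mangled name,
# so it can still be read and changed.
print(sorted(vars(acct)))      # ['_Account__balance', '_hint']
acct._Account__balance = -1
print(acct.balance())          # -1
```

Mangling mainly guards against accidental name clashes in subclasses; it does nothing to stop a determined caller.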
To be completely private, the field has to be accessible by any means that originate from inside the class it exists in, but not by any other means. That rules out not only referencing the field normally -- that is, i.x for field x of object i -- but also taking it from the object dictionary, as in i.__dict__['x'].
I spent the morning trying different strategies to implement a class with private variables -- using either old-style or new-style classes. You'd have to override __getattr__ (or __getattribute__ instead in new-style classes) of course, and you can use the inspect module to examine the interpreter frame so you can tell which access attempts came from a local method and which didn't.
To cut a long story short, it's not possible by any means that I can see -- if you can prove me wrong, send me an email and we'll publish your solution.
The reason for this is a consequence of how Python treats objects: in Python nearly everything is built on dictionaries. In operation, classes are little more than syntactic sugar for dictionaries. When you call, for example, the i.add method, the interpreter looks first in the i.__dict__ dictionary, then in the dictionaries belonging to its class and base classes, following the method resolution order. Whichever entry matches first is used.
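The lookup order described above can be seen directly. This is a small sketch for Python 3, with Base, Child and add as hypothetical names:

```python
class Base:
    def add(self, a, b):
        return a + b


class Child(Base):
    pass


i = Child()

# The method is found in neither the instance nor Child, only in Base.
print("add" in vars(i))      # False
print("add" in vars(Child))  # False
print("add" in vars(Base))   # True

# The search order for classes is the method resolution order.
print([c.__name__ for c in Child.__mro__])  # ['Child', 'Base', 'object']

# An entry placed in the instance dictionary is found first,
# so it shadows the method defined on the class.
i.add = lambda a, b: "shadowed"
print(i.add(1, 2))        # shadowed
print(Child().add(1, 2))  # 3
```

One caveat the simple description leaves out: data descriptors on the class (properties, for example) are consulted before the instance dictionary.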
Now, because traditional access to attributes and methods goes through the __getattr__ or __getattribute__ methods, you can intercept it there. It is possible in a number of ways to implement access control for the dot-form, but you can't restrict access to the dictionary.
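As a rough sketch of the kind of guard described here, and of why it falls short, the following assumes Python 3 on CPython (inspect.currentframe() may return None on other implementations). Guarded, secret, reveal and impostor are hypothetical names, and the "caller has a local called self bound to this object" test is just one possible heuristic, not the only way to do it:

```python
import inspect


class Guarded:
    _private = {"secret"}  # hypothetical set of protected attribute names

    def __init__(self):
        self.secret = 42  # attribute *writes* are not guarded in this sketch

    def __getattribute__(self, name):
        if name in Guarded._private:
            caller = inspect.currentframe().f_back
            # Heuristic: treat the access as internal only if the calling
            # frame has a local variable `self` bound to this very object.
            if caller.f_locals.get("self") is not self:
                raise AttributeError("%r is private" % name)
        return object.__getattribute__(self, name)

    def reveal(self):
        return self.secret  # allowed: called from a method of this object


g = Guarded()
print(g.reveal())  # 42

# The dot-form is blocked from outside...
try:
    g.secret
except AttributeError as exc:
    print("blocked:", exc)

# ...but the instance dictionary is not.
print(vars(g)["secret"])  # 42

# The guard can also be skipped by calling the base implementation directly.
print(object.__getattribute__(g, "secret"))  # 42


# And the frame check itself is easy to satisfy from outside the class.
def impostor(self):
    return self.secret


print(impostor(g))  # 42
```

Guarding "__dict__" inside __getattribute__ as well does not close the gap, since object.__getattribute__(g, "__dict__") still returns the dictionary.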
Even worse, since the object's dictionary is used to store and retrieve attributes and methods, you can't mess with it and expect the class to behave in the same way -- if you delete the entry relating to a method in an instance, that method cannot be used, from inside or outside the class. If you edit the dictionary of the class the instance belongs to, it affects all instances of that class.
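Both effects are easy to reproduce. The sketch below assumes Python 3 and uses a hypothetical Counter class. Note that in Python 3 a class's __dict__ is exposed as a read-only mapping, so the class is changed by assigning an attribute on it rather than by editing the dictionary directly:

```python
class Counter:
    def __init__(self):
        self.count = 0

    def bump(self):
        self.count += 1
        return self.count


a, b = Counter(), Counter()

# Removing an entry from one instance's dictionary hides it from everyone,
# including the class's own methods.
del vars(a)["count"]
try:
    a.bump()
except AttributeError:
    print("a.bump() no longer works")
print(b.bump())  # 1, the other instance is unaffected

# Changing the class, on the other hand, changes every instance at once.
Counter.bump = lambda self: "patched"
print(a.bump(), b.bump())  # patched patched
```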
Is this such a bad thing? Why is it that we want to lock people out of our classes? I've never really thought about it before now: if a field wasn't used outside the class in my reference code, then it was dutifully protected from the outside world. The problem with this kind of programming is that it leads you into thinking of the people using your code as malicious.
There's a lot to be said for defensive programming, but after a certain point it reduces the power of our code. I could see the argument for field safety if you were distributing a library and you wanted to reduce the possible bugs others could write using your code, but for the majority of cases it appears to be motivated by a fear that users will tamper with the programmer's perfect code, in nothing less than a malicious attempt to destroy its purity.
Do we resent others using our work outside our intentions? Some really useful things have been accomplished by using code in ways its authors had never intended (and probably never wanted, for that matter).
Next time you reach for the 'private' keyword, have a good think about whether it's necessary, or whether you're reducing the potential of your code.




1
rhlowe - 06/11/07
One reason I use access modifiers in my code is to remind me what it is doing. I know I could use comments, but if I don't have to I try not to.
2
Greg M - 06/11/07
You need to realize that all code is library code. It's going to be reused, it's going to be maintained. Getting the interfaces right is one of the most important parts of the design process, and that's true at every level of your abstraction hierarchy.
The smaller the interface can be and remain functional, the better. The protect keyword makes your interface smaller. Documentation should be concentrated at interface boundaries. The protect keyword helps to document your interface, in a way that the compiler can guarantee is actually true.
3
sth - 07/11/07
You can wrap the value in a function closure. This way the value itself doesn't show up in dict(), just the function that produces the value.
That said, I wouldn't recommend anyone to actually use this. For normal class interface purposes it should be absolutely enough to make clear to the user which variables are "private", like by prefixing their names with "_". So everybody who wants to use/change them knows this is dangerous/unsafe/unsupported.
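The code from this comment is not reproduced here. Purely as an illustration of the closure idea, and not as sth's actual code, a sketch in Python 3 might look like this (Example and get_secret are hypothetical names):

```python
class Example:
    def __init__(self, secret):
        # The value lives only in the closure of getter(); it is never
        # stored as an entry in the instance dictionary.
        def getter():
            return secret

        self.get_secret = getter


e = Example(42)
print(e.get_secret())   # 42
print(sorted(vars(e)))  # ['get_secret'], only the function shows up

# The captured value is still reachable through function introspection.
print(e.get_secret.__closure__[0].cell_contents)  # 42
```

So the value is hidden from the instance dictionary, but it is still not private in any strict sense.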
4
NickGibson - 07/11/07
That's a clever attempt, sth -- and you're right, using a functional closure means that you can't get at the value that's been set once it's gone through.
That doesn't make it private however, since any user can simply monkeypatch out your protection if they really want access.
5
sth - 07/11/07
With some trickery I managed to pack it all in one big closure, so the value and the accessor functions should be completely hidden and even the inspect-module isn't necessary anymore. It also leads to some side effects like type(Example()) != type(Example()), so nobody should use this just because they would like some private class members.
6
that dude - 07/11/07
man, that's good info
7
Mark - 09/11/07
Coming back to the "why private" question... the statement "Some really useful things have been accomplished by using code in ways its authors had never intended..." is a good example of why.
It isn't the creation that is the problem, it is the maintenance. Private is one of the things that gives you the opportunity to change something safely. If nothing is private you can eventually lead back to something that is impossible to maintain.
A lot of developers (particularly "clever" ones) tend to forget that the code they write today could be around for 20 years or more. I know some code I wrote back in the mid 80s is still in use today.
The more "clever" something is, often the harder it is to maintain. Someone coming along 5+ years from now has to make a minor change... they can spend days just trying to figure out what the hell it does. So they don't. Instead they add something new. The code just gets bloated.
A lot of "clever" developers also forget that not everyone who will be maintaining the code they write will be as clever as them. Remember the perl obfuscation competitions? Just because you can do something clever doesn't mean you should.
Keep it simple (and private where applicable)
Use interfaces
(yes this is a vote for languages where you can make things private)
8
serj - 26/05/08
How about my blog
http://python-a-day.blogspot.com/
It's not state of the art but i do try to publish every interesting thing i do with python
9
Daniel - 24/07/08
serj, that was completely off topic.
On that note, I think that as long as you make sure you document and/or comment your code enough, private/protect shouldn't be necessary.
10
Daniel Gara - 24/09/08
It's like dropping tacks on the road...the guy chasing you can drive around them, but it WILL slow him down! :)
11
vlad - 11/11/09
"It's like dropping tacks on the road...the guy chasing you can drive around them, but it WILL slow him down! :)"
I agree, my method is to use 512-bit variable names generated by Whirlpool and a random number generator. When variables look like this "_3CCF8252D8BBB258460D9AA999C06EE38E67CB546CFFCF48E91F700F6FC7C183AC8CC3D3096DD30A35B01F4620A1E3A20D79CD5168544D9E1B7CDF49970E87F1"