Proplist references

OpenClonk Forum

Topic Development / Developer's Corner / Proplist references

By Günther [de]

Date 2010-03-22 21:17

Problem:

  var a = { foo = 1 }, b = { Prototype=a };
  b["foo"] = 2;
  Log("%d %d", a["foo"], b["foo"]);

At the moment, this logs "2 2", but the intended output is "1 2".

So, I got the idea that references to proplist entries (and probably array entries, too) could store the proplist/array and key instead of a pointer to the entry. Because references can't be stored in proplists or arrays, C4Value wouldn't need to grow. This way, the missing entry could be created when it's accessed through the reference. One obstacle is that we need to ensure that the entry is not created when it's only accessed for reading. Unfortunately, the C4Value API does not make this distinction at the moment, so large parts of the engine would need to be checked.

Also, I'm not sure whether to implement these references using a new C4V_Type, or by using some of the leftover padding in C4Value to add another flag. On the one hand, both kinds of reference should behave exactly the same as far as scripts and other parts of the engine are concerned, so different C4V_Types feel somewhat wrong. And triggering proplist entry creation through conversion from proplist reference to ordinary reference would be wrong if the reference is not used for writing. On the other hand, the two kinds of reference have totally different representations, and a simple mapping from type to active member of the union is preferable.
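To make the difference concrete, here is a minimal Python model of the two reference kinds. It is only a sketch under assumptions: `PropList`, `EntryPointerRef`, `KeyRef`, and `demo` are hypothetical names for illustration, not engine classes, and nothing here reflects the actual C4Value layout.

```python
class PropList:
    """Toy proplist: own entries plus an optional prototype (hypothetical model)."""

    def __init__(self, entries=None, prototype=None):
        self.entries = dict(entries or {})
        self.prototype = prototype

    def owner_of(self, key):
        # Walk the prototype chain and return the proplist that holds the entry.
        node = self
        while node is not None:
            if key in node.entries:
                return node
            node = node.prototype
        return None

    def get(self, key):
        # Reading never creates an entry.
        owner = self.owner_of(key)
        return owner.entries[key] if owner is not None else None


class EntryPointerRef:
    """Models a pointer to the entry that the lookup found (the problematic case)."""

    def __init__(self, proplist, key):
        self.owner = proplist.owner_of(key) or proplist
        self.key = key

    def read(self):
        return self.owner.entries.get(self.key)

    def write(self, value):
        # Writes land wherever the entry was found, possibly in the prototype.
        self.owner.entries[self.key] = value


class KeyRef:
    """Models the proposed reference: it stores (proplist, key) instead."""

    def __init__(self, proplist, key):
        self.proplist = proplist
        self.key = key

    def read(self):
        return self.proplist.get(self.key)

    def write(self, value):
        # The entry is created in the referenced proplist only on write.
        self.proplist.entries[self.key] = value


def demo(ref_class):
    a = PropList({"foo": 1})
    b = PropList(prototype=a)
    ref = ref_class(b, "foo")
    ref.read()
    created_by_read = "foo" in b.entries
    ref.write(2)
    return a.get("foo"), b.get("foo"), created_by_read
```

In this model, `demo(EntryPointerRef)` reproduces the "2 2" behaviour, while `demo(KeyRef)` gives the intended "1 2" without creating the entry on a plain read.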

By PeterW

Date 2010-03-24 13:12

Hm, I guess you are right. This kind of thing would be required for a correct implementation.

On the other hand, it got me thinking: Do we really need references anyway? Couldn't we just make "expr[expr] = expr" and "variable = expr" the grammar rules, and code everything in there? We would need to kill:
* Local/Global/VarNamed - Should be deprecated
* Reference parameters - I guess returning arrays is the cleaner solution anyway
* Reference return values - I think I'm about the only one that actually used that feature
* "(bla = x) = y" - yeah, like anyone actually does this kind of nonsense.

On the plus side, we might get a lot of simplification on the engine side, plus maybe even some speed. Am I overlooking something?

By Günther [de]

Date 2010-03-24 15:35

I had also thought that this might be the right direction, at least for proplist changes. I chose to not do this because of the amount of rewrite it would need. But hey, I've still got half a Clonkmeeting to do this and it would shrink C4Value by one field, so...

In the meantime, I've pushed an implementation of "proplist references", using a new C4V_Type, because ck was for that choice. But the number of bugs I wrote into it on the first try scares me a bit.

By Zapper

Date 2010-03-24 20:44

>* Reference return values - I think I'm about the only one that actually used that feature

I used that, too!

By Günther [de]

Date 2010-03-24 21:25

Hm, I think a lot of the downsides of references could be avoided by just removing reference parameters. That way, the parser can avoid creating references which are never going to be used as references. It would only need to create them as operator parameters and for return statements of reference-returning functions. I think this would remove the need for the array element reference counting, if we change the evaluation order for =: If in "a[0]=a" the "a[0]" were evaluated after the "a", the array would have two references, triggering a copy. And proplist references could always ensure that the property is in the used proplist, without worrying about creating properties as a side effect of reading. Okay, almost, but people who return a reference to a property without then writing to it shouldn't be surprised if that creates a property.

This wouldn't remove the need to save pointers to the references to a C4Value in that C4Value, but it might reduce the list to at most one element.

By Günther [de]

Date 2010-03-24 22:20

On the other hand, using reference parameters is the only way a function can change a big array efficiently - requiring the calling function to get rid of its pointer to the array before calling the function is just too awkward.

By Günther [de]

Date 2010-03-26 13:19

> Couldn't we just make "expr[expr] = expr" and "variable = expr" the grammar rules, and code everything in there?

Well, we could, but it'd require significant changes to the parser. Consider (foo)=bar. At the point where the parser sees the =, the bytecode for foo has already been written. It might be possible to delete that bytecode, write the bytecode for the second expression, and then add the write-to-variable bytecode. But that's gross. So, we'd have to switch to a proper AST first.

Also, += and friends would have to change to read+modify+write bytecodes to avoid having 78 bytecodes for this (13 operators * 6 destination types). Or we'd need to change the bytecodes to encode the destination in the bytecode, instead of always pushing it onto the stack.
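A rough Python sketch of the read+modify+write variant, to illustrate the opcode arithmetic. Everything here is an assumption for illustration: the opcode names (`READ_VAR`, `WRITE_VAR`, `OP`, `PUSH`) are made up, only one destination kind is interpreted, and the split count assumes one read and one write opcode per destination type.

```python
import operator

BINARY_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fused_opcode_count(n_operators, n_destinations):
    # One dedicated opcode per (operator, destination type) pair.
    return n_operators * n_destinations


def split_opcode_count(n_operators, n_destinations):
    # One opcode per operator, plus a read and a write opcode per destination type.
    return n_operators + 2 * n_destinations


def compile_compound(dest_kind, dest_name, op, rhs_code):
    # "x += rhs" becomes: read x, evaluate rhs, apply op, write x.
    return (
        [("READ_" + dest_kind, dest_name)]
        + list(rhs_code)
        + [("OP", op), ("WRITE_" + dest_kind, dest_name)]
    )


def run(code, variables):
    # Tiny stack interpreter for the hypothetical opcodes above.
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "READ_VAR":
            stack.append(variables[arg])
        elif opcode == "OP":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(BINARY_OPS[arg](lhs, rhs))
        elif opcode == "WRITE_VAR":
            # The value stays on the stack as the result of the expression.
            variables[arg] = stack[-1]
        else:
            raise ValueError("unknown opcode: " + opcode)
    return stack
```

With 13 operators and 6 destination types, `fused_opcode_count(13, 6)` gives the 78 mentioned above, while the split scheme in this sketch needs 25.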

And I've got another idea: If we impose the rule that all functions with the same name have the same type - reference or no reference - of parameters and return values, the parser would know whether a certain function parameter needs a reference. That way, we wouldn't need special proplist references, because no reference would be created that the scripter didn't intend.

By PeterW

Date 2010-03-26 16:08 Edited 2010-03-26 16:26

> Consider (foo)=bar.

Can't we just forbid that? I would have - that's why I wrote "varname" above, and not "expr".

> Also, += and friends would have to change to read+modify+write bytecodes

You mean unsugaring "x += a" to "x = x + a" for the bytecode? Seems like the cleanest solution to me anyway.

Concerning bytecode (edited, after a bit of thinking): Okay, first we do have to reverse the execution order in any case. So if someone intends to do "bla()["bla"] = blub()", bla must be executed after blub. Otherwise we might have to re-code the whole reference holding and updating stuff inside the bytecode interpreter.

So I guess the best solution would be to have bytecode producing values (onto the stack) and bytecode producing references (into a local variable in the bytecode interpreter). The second one would be used for "a = b", in the style of "(b bytecode) AB_LOCALN_R AB_SET". The local variable would be immediately read by the next bytecode, which is fast enough that the reference is still guaranteed to be valid.

> If we impose the rule that all functions with the same name have the same type

That doesn't really sound very tempting. Do we want to argue that functions using reference parameters are somehow likely to have unique names?

I do see your point with array references being required, however. Maybe change arrays to reference semantics (yeah, icky, I know) or require that people use

var bla = { a = [...huuuge array...] };
ChangeHuuuugeArray(bla);

? ;)

By Günther [de]

Date 2010-03-26 17:36

>> Consider (foo)=bar.
> Can't we just forbid that? I would have - that's why I wrote "varname" above, and not "expr".

Hm, it would certainly reduce the orthogonality of the language, but yeah, it's not especially useful.

> bla()["bla"] = blub()

I'd have done this with an AB_ARRAYSET bytecode which takes three parameters - array, key, and value, which could be produced in any order.
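A small Python sketch of how such a three-operand set operation leaves the evaluation order to the compiler. AB_ARRAYSET is only the name proposed above; the stack layout, the `order` parameter, and the helper names `array_set` and `evaluate` are assumptions for illustration.

```python
def array_set(stack, order=("container", "key", "value")):
    """Pop three operands that were pushed in `order`, then do container[key] = value."""
    operands = {}
    for name in reversed(order):
        operands[name] = stack.pop()
    operands["container"][operands["key"]] = operands["value"]
    # The assignment expression yields the stored value.
    stack.append(operands["value"])


def evaluate(value_first):
    """Model bla()["bla"] = blub() with either evaluation order."""
    log = []
    target = {}

    def bla():
        log.append("bla")
        return target

    def blub():
        log.append("blub")
        return 42

    if value_first:
        stack = [blub(), bla(), "bla"]
        array_set(stack, order=("value", "container", "key"))
    else:
        stack = [bla(), "bla", blub()]
        array_set(stack)
    return log, target, stack
```

Both orders store the same value; only the order of the side effects of `bla` and `blub` differs, which is the part the compiler would be free to choose.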

> Maybe change arrays to reference semantics (yeah, icky, I know)

Well, I've since lost my reluctance to change the Objects.txt, so this is now purely a language design issue. Do you know any language which doesn't use reference semantics for arrays besides C4Script?

> var bla = { a = [...huuuge array...] };
> ChangeHuuuugeArray(bla);

There's also the drop-the-reference variant:

var a = [...huuuge array...];
a = ChangeHuuuugeArray(a, a=0);

A proper optimizing compiler could even generate this automatically. SSA would make it trivial, I think. We're far away from that, unfortunately.
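Why dropping the reference helps can be modelled in Python with an explicit reference count and copy-on-write. This is a toy model under assumptions: `Storage`, `Value`, and the two `call_*` functions are hypothetical, and the engine's actual array sharing may work differently.

```python
class Storage:
    """Shared array storage with a manual reference count."""

    def __init__(self, items):
        self.items = list(items)
        self.refcount = 0


class Value:
    """Handle with value semantics: handles share storage until one of them writes."""

    def __init__(self, storage):
        self.storage = storage
        storage.refcount += 1

    def copy_handle(self):
        return Value(self.storage)

    def release(self):
        self.storage.refcount -= 1
        self.storage = None

    def set(self, index, item):
        """Write one element; return True if the write had to copy the storage."""
        copied = False
        if self.storage.refcount > 1:
            # Someone else still sees this storage, so detach first.
            old = self.storage
            old.refcount -= 1
            self.storage = Storage(old.items)
            self.storage.refcount = 1
            copied = True
        self.storage.items[index] = item
        return copied


def change_array(arg):
    copied = arg.set(0, "changed")
    return arg, copied


def call_keeping_reference(n):
    # a = ChangeHuuuugeArray(a): the caller's variable still holds the array.
    a = Value(Storage(range(n)))
    arg = a.copy_handle()
    result, copied = change_array(arg)
    a.release()
    a = result
    return copied


def call_dropping_reference(n):
    # a = ChangeHuuuugeArray(a, a=0): the second argument clears the variable first.
    a = Value(Storage(range(n)))
    arg = a.copy_handle()
    a.release()
    result, copied = change_array(arg)
    a = result
    return copied
```

In this model the first call copies the whole array on the first write, and the second call changes it in place because the function holds the only remaining reference.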

In any case, I think I'll concentrate my efforts on rewriting the parser using an AST. Even if we keep references this will enable other improvements.

By Randrian [de]

Date 2010-04-01 16:13

At the moment proplists seem to be broken: proplist_variable["key"] overwrites the proplist with the value found in "key", even if the value is only read, not written.

By Günther [de]

Date 2010-04-04 01:28

Fixed, thanks.

By Günther [de]

Date 2010-03-24 16:06 Edited 2010-03-24 16:15

This is for scripters. If you don't know what an option means, you're probably not using it ;-)

By PeterW

Date 2010-03-24 17:28 Edited 2010-03-24 17:32

Is that really the right question to ask? Some of those must have been in use simply because there were no alternatives. Now there are arrays and proplists. So it would be more interesting to know how those features got used.

Oh, and by the way: I don't see any reason why "a = (b = c)" should stop working - at least not if we don't want to kick other constructs like "a == (b = c)" as well.

By Günther [de]

Date 2010-03-24 19:43

I just listed every use of references I could think of. If some of these don't get lots of answers, we would know that we wouldn't need to provide a convenient replacement. I think reference return values and use of = to assign values to normal variables are the only ones that do not have a good replacement at the moment.

By Günther [de]

Date 2010-03-24 19:45

Bah, merging topics silently destroys polls. Well, you're probably right anyway. Anybody want to share your inventive use of functions returning references?
