> On Jun 28, 2024, at 3:07 AM, Michael Morris <[email protected]> wrote:
> On Thu, Jun 27, 2024 at 8:16 PM Mike Schinkel <[email protected]> wrote:
> node_modules IMO is one of the worst things about the JavaScript ecosystem. Who has not seen
> the meme about node_modules being worse than a black hole?
>
> Fair enough. Or maybe import maps would be a better way forward.
Import maps are really a small part of what PHP actually needs. For example, is it a class, an
interface, or a function? For a module, is it a property?
I envision basically that this file, whatever it would be called, would be a pre-compilation of
everything that PHP can pre-compile about the files that are contained within the module/directory.
See below where I talk about a pre-compiled .php.module file.
> But ensuring that it is possible to disallow loading needs to be contemplated in the design.
> PHP has to be able to know what is a module and what isn't without expensive processes.
>
> One possible solution is that if modules do not have <?php ?> tags, ever, and someone
> directly tries to load a module through http(s) the file won't execute. Only files with
> <?php ?> tags are executable by the web sapi.
Except that would require parsing every file in the directory in its entirety to know (unless
everything were pre-compiled as I am advocating).
Still, I think it would be better to be explicit, and for that I would propose that the first line
of the file must start with "module" followed by the name of the module.
> I've only touched the surface on how GoLang does things. Some of it was confusing to me at
> first. It's also been awhile so I'd need to refresh my memory to speak to it.
In Go, modules (or, in this context, more correctly "packages") are:
- A collection of files grouped into a directory and thus all files in that directory are in the
same package.
- Public or private scope is determined by the case of a symbol's first letter: lowercase is private
and uppercase is public. People coming from other languages tend to hate this, but I have come to
love it because it makes code less dense while conveying the same information as "public" and
"private" keywords. It also makes code across different developers more consistent.
- Packages can be nested in package directories, but...
- There is no concept of a "sub" package, meaning there are no hierarchies when packages
are used in code (there is a file path hierarchy but that is only relevant for importing the
package.) When I started working with Go I thought that was unfortunate. Now after 5+ years working
with Go I see it as a really good decision.
- Package files must have a "package" statement at the top, and all files in the directory
must have the same "package" statement, with one caveat.
- That caveat is that package files can have package <packagename>_test as a package name and that
file is assumed to contain a test but it cannot see private members in package <packagename>.
- Test files are typically named to pair with a <filename>.go and would be named <filename>_test.go.
That file's package name can either be just <packagename> or <packagename>_test, depending on if you
want to reach into private members or not.
- You can also find test packages that contain all <filename>_test.go files.
- Testing is built into Go with go test ./... to run all tests in current and all subdirectories.
(Idiomatic testing in Go is so much easier than idiomatic testing in PHP, resulting in a culture of
testing among almost all Go developers.)
- Package files can have types, vars, consts, and funcs as well as imports and directives, of course.
- Types in Go can be struct (which is the closest Go has to a class), slice of type e.g. []<type>,
array of type e.g. [<n>]<type>, map[<key>]<value>, and a few more that I won't go into as I think
they are out of scope for this explanation.
- Packages can have one or more init() functions that are all called before the program's main()
func is called. There can also be multiple init() functions even in the same file.
- vars can be initialized and those initializations are run before the program's main() func is
called.
- consts are initialized before the program's main() func is called but can only be initialized by
literal scalar types. Unfortunately.
- imports take the form of import "<package>" for standard library types, where a <package> can
contain parent paths.
- For local types imports take the form of import "<module>/<package>" where a <module> is defined
by having a go.mod file in the directory or a parent directory, and a <package> can contain parent
paths. A go.mod file has a module directive, a go version, and one or more require statements (I'm
ignoring a bit of minutiae here.)
- Modules allow grouping of packages together and were added in recent years to provide versioning
for the collective dependencies of a module. The required versions are recorded in go.mod, with
checksums of the downloaded modules stored in go.sum, and both are managed automatically with Go CLI
commands.
- For external third party modules imports take the form of import "<domain>/<package>" where
<package> can contain parent paths and almost always does. An example is
"github.com/stretchr/testify/assert".
- External modules are by definition HTTP(S) GETable, and Go developers use go get <module> on the
command line to download the module. Go does not have or need a 3rd party package manager as that
can become a single point of failure and is definitely a single point of control. To download
testify for use in their Go module a Go dev would run go get github.com/stretchr/testify
- Most external third party modules for Go are hosted on Github but can be hosted on a custom
domain, Bitbucket, GitLab, etc.
- The Go team manages a standard proxy for go get but organizations can run their own if desired.
- imports are referenced by name internally where the package name is the last segment of the import
after a /, or just the name if no slash. So "github.com/stretchr/testify/assert" is referenced in
code as assert. For example, assert.Equal(t,1,value) checks that value equals 1; if value!=1 then it
would use the testing variable t to mark this assertion as an error and generate appropriate output.
- imports can be aliased so you could import check "github.com/stretchr/testify/assert" and then
call check.Equal(t,1,value) instead of assert.Equal(t,1,value), but needing to alias a package
frequently is a code smell for a badly named package.
- You can use . as an alias and then not need to use the alias, so we could
import . "github.com/stretchr/testify/assert" and then just call Equal(t,1,value) instead of
assert.Equal(t,1,value) but this is frowned on in the Go community except for in very specific
use cases.
- You can use _ to bring in a package even if you are not referencing it in case it has an init()
function that you need to run. If that applied to testify it would look like this:
import _ "github.com/stretchr/testify/assert".
- All of import, var, and const support a multiline form using parentheses like so:
    var (
        x = 1
        y = 2
    )
- Module names are idiomatically one word w/o underscores and lowercase.
- There is no need to import specific symbols from a Go package like there is in JavaScript. I have
programmed in both Go and JS, and I have not found a real benefit to having to reference everything
explicitly in the import — since you have to mention the package name everywhere you use any
package symbol — but I have noticed a benefit to not having nearly as much boilerplate at the top
of the file for import when working with Go vs. working with JavaScript. And my GoLang IDE just
manages imports for me whereas WebStorm just calls out when I haven't imported function names
in JavaScript.
I am sure there is more I missed, but that should cover the highlights.
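To make a few of those bullets concrete, here is a small self-contained sketch that uses only the standard library. It shows an aliased import, the parenthesized var and const forms, uppercase vs. lowercase visibility, and the order in which var initializers, multiple init() functions, and main() run. The names order, record, ready, MaxItems, and defaultName are made up for illustration.

```go
package main

import (
	"fmt"
	str "strings" // aliased import: referenced as str instead of strings
)

// Grouped declarations using the parenthesized multi-line form.
var (
	order []string        // records each initialization step as it happens
	ready = record("var") // package-level var initializers run before any init()
)

// MaxItems is exported (uppercase first letter); defaultName is visible only
// inside this package (lowercase first letter).
const (
	MaxItems    = 10
	defaultName = "demo"
)

// record appends a step to order and returns true so it can initialize a var.
func record(step string) bool {
	order = append(order, step)
	return true
}

// Multiple init() functions in one file run in the order they appear.
func init() { record("init 1") }
func init() { record("init 2") }

func main() {
	record("main")
	fmt.Println(str.Join(order, " -> "))
}
```

Running this prints "var -> init 1 -> init 2 -> main", which is the ordering that items 4 and 5 in the list below would need PHP to define for module-level init() functions and vars.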
The takeaways that I think would be useful for PHP modules are:
1. Imports
2. Import aliases
3. Module-level consts
4. Module-level init() functions
5. Module-level vars with initialization
6. Module-level functions
7. One directory == one module
8. No hierarchy for modules
9. Single word module names in lowercase.
10. Module syntax being <module><operator><symbol>, e.g. mymodule->MySymbol
Takeaways I wish the PHP community would consider but doubt there is any chance:
1. Having modules be HTTP(S) GETtable with php get <module>
2. Uppercase being public, lowercase being private, and no need for protected
3. Test packages with testing built into PHP e.g. php test ./...
> >> I'm not fond of this either.
> >
> > There will need to be a way to define the entrypoint php. I think index.php is
> > reasonable, and if another entry point is desired it can be called out ->
> > "mypackage/myentry.php"
>
> Why is an entry point needed? If there is a module metadata file as I am proposing PHP can get
> all the information it needs from that file. Maybe that is the .phm file?
>
> Maybe. Again, I need to look over this meta data format. Also, how does it get created?
As I am envisioning, PHP at the command line would have the ability to pre-compile a module (aka
all files in the module directory) and then write a module-specific file, maybe .php.module? That
could ideally be optimized for loading by PHP and have everything it needs to know to run the code
in that module.
That file could be completely self-contained, including all source code, similar to a .phar file,
or it could just have a complete symbol table and still require the PHP source code to exist as
well. I have not pondered all the pros and cons of these alternatives yet.
Clearly though, even if it compiled to a self-contained file the .php files would still be needed
during development. Thus I envision that the PHP CLI would need a --watch option to watch
directories and recompile the .php.module file upon PHP file change. IDEs like PhpStorm could run
php --watch for users and non-IDE users could run it themselves.
When PHP would come across an import statement pointing to a module directory it would first look
for the compiled .php.module file and if found use it but if not found it would recreate it. Maybe
it could write to disk, or generate an error if it cannot write to disk. OTOH writing to disk might
be a security issue in which case it could issue a warning that the .php.module file does not exist
and then compile the module to memory and continue on.
It would be nice if there was a mode where PHP would check the timestamps of all PHP files in the
module directory and if the compiled .php.module was older than any of the .php files then
recompile, but you'd want that off for production. That could be a new function
set_dev_mode(boolean) or a CLI option to create a .phpdev.module instead of a .php.module.
Clearly anyone using deployments could have their build generate all the required .php.module files
for deployment, and hosting companies that host apps that don't use deployments like WordPress could
have processes that build the .php.module files for their users.
I think I have thought through this enough to identify there are no technical blockers, but I could
certainly have missed something so please call it out if anyone can identify something that would
keep this from working and/or significantly change the nature of PHP development.
BTW, this pre-compiling would ONLY apply to modules, so people not using modules would not have to
be concerned about any of this at all.
-Mike
P.S.
> I remember when the choice to use \ was made. I've rarely been so angry about a language
> design choice before or since. I've gotten used to it, but seeing \\ all over the place in
> strings is still.. yuck.
Ditto.