A Sense of Doubt blog post #1250 - Microsoft Improving Windows Command Line
I know, I know, I should be using Unix for all my command line folderol. But even though my laptop dual boots (using GRUB) to Linux in one partition and Windows in the other, all my work is in Windows, and so I find it's much easier to use the Windows Command Line console to do some coding, as seen in the top image from my own work with Python.
I spotted Slashdot's story on Microsoft overhauling the command line, and I was intrigued to read the blog posts. When I found that they provided not only an overview of the Windows console but a history that included Teletype and MS-DOS, I knew I wanted to feature the content here.
Remember, this is my study more so (often) than my teaching.
If coding is your gig, enjoy.
I don't have permission to re-post any of this content, but I am giving due credit to the original sources.
Microsoft Is Making the Windows Command Line a Lot Better (arstechnica.com)
Posted by BeauHD from the new-and-improved dept.
An anonymous reader quotes a report from Ars Technica: Over the last few years, Microsoft has been working to improve the Windows console. Console windows now maximize properly, for example. In the olden days, hitting maximize would make the window taller but not wider; today, the action fills the whole screen, just like any other window. Motivated especially by the Windows Subsystem for Linux, the console in Windows 10 supports 16 million colors and VT escape sequences, enabling much richer console output than has traditionally been possible on Windows.
Microsoft is working to build a better console for Windows, one that we hope will open the door to the same flexibility and capabilities that Unix users have enjoyed for more than 40 years. The APIs seem to be in the latest Windows 10 Insider builds, though documentation is a little scarce for now. The command-line team is publishing a series of blog posts describing the history of the Windows command-line, and how the operating system's console works. The big reveal of the new API is coming soon, and with this, Windows should finally be able to have reliable, effective tabbed consoles, with emoji support, rich Unicode, and all the other things that the Windows console doesn't do... yet.
FROM -
https://blogs.msdn.microsoft.com/commandline/2018/06/20/windows-command-line-backgrounder/
Windows Command-Line: Backgrounder
This is the first of a series of posts in which we’ll explore all things command-line – from the origins of the command-line and the evolution of the terminal, to what we’re doing to overhaul and modernize the Windows Console & command-line in future Windows releases.
Posts in this series (will be updated as more posts are published):
- Backgrounder (this post)
- The Evolution of the Windows Command-Line
- Inside the Windows Console
Whether you’re a seasoned veteran, or are new to computing (welcome all), we hope you’ll find these posts interesting, informative, and fun. So, grab a coffee and settle in for a whirlwind tour through the origins of the command-line!
A long time ago in a server room far, far away …
From the earliest days of electronic computing, human users needed an efficient way to send commands and data to the computer, and to be able to see the results of their commands/calculations.
One of the first truly effective human-computer interfaces was the Tele-Typewriter – or “Teletype”. Teletypes were electromechanical machines with keyboards for user input, and an output device of some kind – printers in the early days, screens in more recent devices – which displayed output to the user.
The characters that the operator typed were buffered locally and sent from the Teletype to a nearby mini or mainframe computer as a series of signals along an electrical cable (e.g. RS-232 cable) at 10 characters per second (110 baud/bits per second – bps):
Note: David Gesswein’s awesome PDP-8 site has a lot more information on the ASR33 (and the PDP-8 and associated tech), including photos, videos, etc.
The program running on the computer would receive the typed characters, decide what to do with them, and might optionally, asynchronously send characters back to the Teletype. The Teletype would print/display the returned characters for the operator to read and respond to.
In the years that followed, the technology improved, boosting transmission speeds up to 19,200 bps, and replacing the noisy and expensive-to-operate printer with a Cathode Ray Tube (CRT) display most often associated with computer terminals of the ‘80s and ‘90s, including the ubiquitous DEC VT100 terminal:
While the technology improved, this model of the terminal sending characters to programs running on the computer, and the computer responding with text output to the user, remained, and remains today, as the fundamental interaction model for all command-lines & terminals on all platforms!
Part of the elegance of this model is the fact that each component part of the system remained simple and consistent: The keyboard emitted characters which were buffered for output as electrical signals to the connected computer. The output device simply wrote the characters emitted by the connected computer onto the display technology (e.g. paper/screen).
And because each stage of the system communicated with the next stage by simply passing streams of characters, it is a relatively simple process to introduce different communications infrastructure, adding, for example, modems which allow streams of input and output characters to be sent over great distances via telephone lines.
Text Encoding
It’s important to remember that terminals and computers communicate via streams of characters: When a key on the terminal’s keyboard is pressed, a value representing the typed character is sent to the connected computer. Press the ‘A’ key and the value 65 (0x41) is sent. Press the ‘Z’ key and the value 90 (0x5a) is sent.
7-bit ASCII Text Encoding
The list of characters and their values is defined in the American Standard Code for Information Interchange (ASCII) standard (ISO/IEC 646 / ECMA-6), a “7-bit coded character set” which defines:
- The 128 values that represent the printable Latin characters A-Z (65-90), a-z (97-122), and digits 0-9 (48-57)
- Many common punctuation characters
- Several non-displayable device control codes (0-31 & 127)
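The ASCII mapping above is easy to verify in Python, whose ord() and chr() built-ins expose the same character values a Teletype would have transmitted:

```python
# Each key press is transmitted as a small numeric value. Python's ord()
# and chr() expose the same mapping the ASCII standard defines.
print(ord("A"))   # 65 (0x41) - first uppercase Latin letter
print(ord("Z"))   # 90 (0x5a)
print(ord("a"))   # 97 (0x61)
print(ord("0"))   # 48 (0x30)

# Values 0-31 and 127 are non-printable device control codes, e.g.:
print(ord("\n"))  # 10 - Line Feed
print(ord("\r"))  # 13 - Carriage Return
bel = chr(7)      # BEL - historically rang the Teletype's physical bell
```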
When 7 bits aren’t enough – Code-Pages
However, 7 bits do not provide enough room to encode the many diacritics, punctuation marks, and symbols used in other languages and regions. With the addition of an eighth bit, the ASCII character table can be extended with sets of “Code-Pages” that define characters 128-255 (and may re-define several non-printable ASCII characters).
For example, IBM defined code-page 437, which added several block-drawing characters like ╫ (215) and ╣ (185), symbols including π (227) and ± (241), and defined printable glyphs for the normally non-printable characters 1-31:
Code-Page 437
The Latin-1 code-page defines many characters and symbols used by Latin-based languages:
Many command-line environments and shells allow the user to change code-pages, which causes the terminal to display different characters (depending on the available fonts), especially for characters with a value of 128-255. However, note that the wrong code-page can cause the displayed text to look “mojibaked”. And, yes, “mojibake” is a real term! Who knew?
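Python's codecs make both effects easy to demonstrate: the same byte value maps to different characters under different code-pages, and decoding bytes with the wrong code-page produces exactly the "mojibake" described above:

```python
# The same byte means different things under different code-pages.
# Byte 0xE3 (227) is the pi symbol in IBM's CP437, but 'ã' in Latin-1.
raw = bytes([0xE3])
print(raw.decode("cp437"))     # π
print(raw.decode("latin-1"))   # ã

# "Mojibake": text encoded under one code-page but decoded under another.
box = "╣".encode("cp437")      # box-drawing character, byte 185 in CP437
print(box.decode("latin-1"))   # ¹  - garbled under the wrong code-page
```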
When 8 bits aren’t enough - Unicode
While code-pages provided a solution for a while, they have many shortcomings, including the fact that they do not allow text for multiple code-pages/languages to be displayed at the same time. So, a new encoding was required that would allow the accurate representation of every character and script of every language known to man, with plenty of room to spare!
Enter, Unicode.
Unicode is an international standard (ISO/IEC 10646) that (currently) defines 137,439 characters covering 146 modern and historic scripts, plus many symbols and glyphs, including the many emoji in widespread use across practically every app, platform, and device. The Unicode standard is regularly updated, adding additional writing systems, adding/correcting emoji symbols, etc.
Unicode also defines “non-printable” formatting characters that allow, for example, characters to be conjoined and/or to affect the preceding or subsequent characters. This is particularly useful in cursive scripts like Arabic, where a given character’s shape is determined by the characters surrounding it. Emoji also use the “zero width joiner” to combine several characters into one visual glyph; for example, Microsoft's Ninja Cat emoji are formed by joining the cat emoji with other emoji to render ninja-cat emoji:
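The zero-width-joiner mechanism is easy to see from Python. (The Ninja Cat sequences are Microsoft-specific, so this sketch uses a standard Unicode ZWJ sequence instead: WOMAN + ZWJ + LAPTOP, which renders as the single "woman technologist" emoji on platforms that support it.)

```python
# U+200D ZERO WIDTH JOINER glues several code points into one visual glyph.
ZWJ = "\u200d"
technologist = "\U0001F469" + ZWJ + "\U0001F4BB"  # WOMAN + ZWJ + LAPTOP

print(technologist)       # one visible glyph on a Unicode-capable terminal...
print(len(technologist))  # ...but three separate code points
```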
When many-bytes are too many – UTF-8!
The space required to represent all the symbols defined by Unicode, especially complex characters, emoji, etc. could be very large and may require several bytes to uniquely and systematically define every displayable character.
Thus, several encodings have been developed that trade storage space vs. time/effort required to encode/decode the data: UTF-32 (4 bytes / char), UTF-16/UCS-2 (2 bytes / char), and UTF-8 (1-4 bytes / char) are among the most popular Unicode encodings.
Thanks in large part to its backward-compatibility with ASCII and its storage efficiency, UTF-8 has emerged as the most popular Unicode encoding on the internet, and has seen explosive adoption ever since 2008 when it overtook ASCII and other popular encodings:
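These trade-offs are easy to observe in Python by encoding the same characters under each scheme and comparing the byte counts:

```python
# The same text occupies different amounts of space under each encoding.
widths = {}
for text in ("A", "é", "€", "🙂"):
    widths[text] = (len(text.encode("utf-8")),     # 1-4 bytes per code point
                    len(text.encode("utf-16-le")),  # 2 or 4 bytes
                    len(text.encode("utf-32-le")))  # always 4 bytes
    print(text, widths[text])

# ASCII text is byte-for-byte identical in UTF-8 - the key to UTF-8's
# backward-compatibility with ASCII:
assert "Hello".encode("utf-8") == "Hello".encode("ascii")
```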
So, while most terminals started by supporting 7-bit and then 8-bit ANSI text, most modern terminals support Unicode/UTF-8 text.
So, what IS a Command Line, and what is a Shell?
The “Command-Line” or CLI (Command Line Interface/Interpreter) describes the most fundamental mechanism through which a human operates a computer: A CLI accepts input typed-in by the operator and performs the requested commands.
For example, typing echo Hello writes the text “Hello” to the output device (e.g. screen), while dir (Cmd) or ls (PowerShell/*NIX) lists the contents of the current directory, etc.
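As a minimal sketch of this request/response loop (assuming a POSIX-style system where sh and echo are available), a program can hand a command line to the system shell and read back the stream of characters the command produces:

```python
import subprocess

# A shell reads a command line, runs the requested program, and the program
# writes its results as a stream of characters. Here Python asks the system
# shell to run "echo Hello" and captures that character stream.
result = subprocess.run("echo Hello", shell=True,
                        capture_output=True, text=True)
print(result.stdout)  # the text the command wrote to its output device
```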
In earlier computers, the commands available to the operator were often relatively simple, but operators quickly demanded more and more sophistication, and the ability to write scripts to automate mundane or repetitive, or complex tasks. Thus command-line processors grew in sophistication and evolved into what are now commonly known as command-line “shells”.
In UNIX/Linux the original UNIX shell (sh) inspired a plethora of shells including the Korn shell (ksh), C shell (csh) and Bourne Shell (sh), which itself begat the Bourne Again Shell (bash), etc.
In Microsoft’s world:
- The original MS-DOS (command.com) was a relatively simple (if quirky) command-line shell
- Windows NT’s “Command Prompt” (cmd.exe) was designed to be compatible with legacy MS-DOS command.com/batch scripts, and added several additional commands for the new, more powerful operating system
- In 2006, Microsoft released Windows PowerShell
- PowerShell is a modern object-based command-line shell inspired by the features of other shells, and was built upon and incorporates the power of the .NET CLR & .NET Framework
- Using PowerShell, Windows users can control, script, and automate practically every aspect of a Windows machine, group of Windows machines, network, storage systems, databases, etc.
- In 2016, Microsoft open-sourced PowerShell and enabled it to run on macOS and many flavors of Linux and BSD!
- In 2016, Microsoft introduced Windows Subsystem for Linux (WSL)
- Enables genuine unmodified Linux binaries to run directly on Windows 10
- Users install one or more genuine Linux distros from the Windows Store
- Users can run one or more distro instances alongside one another and existing Windows applications and tools
- WSL enables Windows users to run all their favorite Windows tools and Linux command-line tools side-by-side without having to dual-boot or utilize resource-hungry Virtual Machines (VM’s)
We’ll revisit Windows command-line shells in the future, but for now know that there are various shells, and they accept commands typed by the user/operator, and perform a wide variety of tasks as required.
The Modern Command-Line
Modern-day computers are vastly more powerful than the “dumb terminals” of yesteryear and generally run a desktop Operating System (e.g. Windows, Linux, macOS) sporting a Graphical User Interface (GUI). These GUI environments allow multiple applications to run simultaneously within their own “window” on the user’s screen, and/or invisibly in the background.
The clunky, hulking electromechanical Teletype machines have been replaced with modern terminal applications that run within an on-screen window, but still perform the same essential functions as the terminal devices from the past.
Similarly, command-line applications, to which terminal apps are connected, work in the same way that they always did: They receive input characters, decide what to do with those characters, (optionally) do work, and may emit text to be displayed to the user. But instead of communicating via slow TTY serial communications lines, terminal apps and command-line applications on the same machine communicate via very high-speed, in-memory Pseudo Teletype (PTY) communications.
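On POSIX systems, Python's pty module makes this in-memory plumbing easy to see (this sketch is POSIX-only; Windows exposes its own equivalent of this mechanism):

```python
import os
import pty

# A pseudo-terminal pair: the "master" end plays the role of the terminal
# app; a command-line program would use the "slave" end as its stdin/stdout.
master, slave = pty.openpty()

# The "command-line app" side emits characters...
os.write(slave, b"Hello")

# ...and the "terminal" side reads that same character stream - just as a
# Teletype once did over a serial cable, only now as an in-memory transfer.
data = os.read(master, 1024)
print(data)

os.close(master)
os.close(slave)
```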
Of course, while modern terminals primarily communicate with command-line applications running locally, they can also communicate with command-line applications running on other machines on the same network, or even remote machines running on the other side of the world via the internet. This “remoting” of the command-line experience is a powerful tool which is popular on every platform, especially *NIX platforms.
So, where are we?
In this post, we took a historical tour through the most important aspects of the command-line that are common to both *NIX and Windows: Terminals, Shells, Text & text encoding.
It will be important to remember the information above as we continue to our next post where we’ll learn more about the Windows Console, what it is, how it works, how it differs from *NIX terminals, where it has challenges, and what we’re doing to remedy these challenges, and bring the Windows Console into the 21st Century!
Stay Tuned - more to come!!
https://blogs.msdn.microsoft.com/commandline/2018/06/27/windows-command-line-the-evolution-of-the-windows-command-line/
Windows Command-Line: The Evolution of the Windows Command-Line
Welcome to the second post in this "Windows Command-Line" series. In this post we'll discuss some of the background & history behind the Windows Command-Line, exploring its path from its humble origins in MS-DOS to its modern-day incarnation supporting tools like PowerShell and the Windows Subsystem for Linux.
Posts in this series:
- Command-Line Backgrounder
- The evolution of the Windows Command-Line (this post)
- Inside the Windows Console
In this series' previous post, we discussed the history and fundamentals of the Command-Line and saw how the architecture of Command-Lines in general has remained largely consistent over time, even while terminals evolved from electromechanical teletypes through to modern terminal applications.
Our journey now continues along a rather tangled path, starting with early PC's, winding through Microsoft's involvement with several Operating Systems, to the newly reinvigorated Command-Line of today:
From humble beginnings - MS-DOS
Back in the early years of the PC industry, most computers were operated entirely by typing commands into the command-line. Machines based on Unix, CP/M, DR-DOS, and others tussled for position and market share. Ultimately, MS-DOS rose to prominence as the de-facto standard OS for IBM PC's & compatibles, especially in businesses:
Like most mainstream Operating Systems of the time, Microsoft's MS-DOS' "Command-Line Interpreter" or "shell" provided a simple, quirky, but relatively effective set of commands, and a command-scripting syntax for writing batch (.bat) files.
MS-DOS was very rapidly adopted by businesses large and small, which, between them, created many millions of batch scripts, some of which are still in use today! Batch scripts are used to automate the configuration of users' machines, set/change security settings, update software, build code, etc.
You may never/rarely see batch or command-line scripts running since many are executed in the background while, for example, logging into a work PC. But hundreds of billions of command-line scripts and commands are executed every day on Windows alone!
While the Command-Line is a powerful tool in the hands of those with the patience and tenacity to learn how to make the most of the available commands and tools, most non-technical users struggled to use their Command-Line driven computers effectively, and most disliked having to learn and remember many seemingly arcane/abbreviated commands to make their computers do anything useful.
A more user-friendly, productivity-oriented user experience was required.
The GUI goes mainstream
Many competing GUI's rapidly emerged, including those of the Apple Lisa and Macintosh, Commodore Amiga (Workbench), Atari ST (DRI's GEM), Acorn Archimedes (Arthur/RISC OS), Sun Workstation, X11/X Windows, and many others, including Microsoft Windows:
Windows 1.0 arrived in 1985, and was basically an MS-DOS application that provided a simple tiled-window GUI environment, allowing users to run several applications side-by-side:
Windows 2.x, 3.x, 95, and 98, all ran atop an MS-DOS foundation. While later versions of Windows began to replace features previously provided by MS-DOS with Windows-specific alternatives (e.g. file-system operations), they all relied upon their MS-DOS foundations.
Note: Windows ME (Millennium Edition) was an interesting chimera! It finally replaced the MS-DOS underpinnings and real-mode support of previous versions of Windows with several new features (esp. Gaming & Media tech). Some features were incorporated from Windows 2000 (e.g. new TCP/IP stack), but tuned to run on home PC's that might struggle to run full NT. This story might end up being an interesting post in and of itself someday! (Thanks Bees for your thoughts on this :))
However, Microsoft knew that they could only stretch the architecture and capabilities of MS-DOS and Windows so far: Microsoft knew it needed a new Operating System upon which to build their future.
Microsoft - Unix Market Leader! Yes, seriously!
While developing MS-DOS, Microsoft was also busy delivering Xenix - Microsoft's port of Unix version 7 - to a variety of processor and machine architectures including the Z8000, 8086/80286, and 68000.
By 1984, Microsoft's Xenix had become the world's most popular Unix variant!
However, the US Government's antitrust breakup of the Bell System freed AT&T - whose Bell Labs was the home of Unix - to begin selling Unix System V to computer manufacturers and end-users.
Microsoft felt that without their own OS, their ability to achieve their future goals would be compromised. This led to the decision to transition away from Xenix: In 1987, Microsoft transferred ownership of Xenix to its partner The Santa Cruz Operation (SCO) with whom Microsoft had worked on several projects to port and enhance Xenix on various platforms.
Microsoft + IBM == OS/2 … briefly
In 1985, Microsoft began working with IBM on a new Operating System called OS/2. OS/2 was originally designed to be "a more capable DOS" and was designed to take advantage of some of the modern 32-bit processors and other technology rapidly emerging from OEM's including IBM.
However, the story of OS/2 was tumultuous at best. In 1990 Microsoft and IBM ended their collaboration. This was due to a number of factors, including significant cultural differences between the IBM and Microsoft developers, scheduling challenges, and the explosive success and growth in adoption of Windows 3.0. IBM continued development & support of OS/2 until the end of 2006.
By 1988 Microsoft was convinced that its future success required a bigger, bolder and more ambitious approach. Such an approach would require a new, modern Operating System which would support the company's ambitious goals.
Microsoft's Big Bet - Windows NT
In 1988, Microsoft hired Dave Cutler, creator of DEC's popular and much respected VAX/VMS Operating System. Cutler's goal - to create a new, modern, platform-independent Operating System that Microsoft would own, control, and would base much of its future upon.
That new Operating System became Windows NT - the foundation that evolved into Windows 2000, Windows XP, Windows Vista, Windows 7, Windows 8, and Windows 10, as well as all versions of Windows Server, Windows Phone 8+, Xbox, and HoloLens!
Windows NT was designed from the start to be platform independent, having initially been built to support Intel's i860, then the MIPS R3000, Intel 80386+, DEC Alpha, and PowerPC. Since then, the Windows NT OS family has been ported to support the IA64 "Itanium", x64, and ARM / ARM64 processor architectures, among others.
Windows NT provided a Command-Line interface via its "Windows Console" terminal app, and the "Command Prompt" shell (cmd.exe). Cmd was designed to be as compatible as possible with MS-DOS batch scripts, to help ease business' adoption of the new platform.
The Power of PowerShell
While Cmd remains in Windows to this day (and will likely do so for many decades to come), because its primary purpose is to remain as backward-compatible as possible, Cmd is rarely improved. Even "fixing bugs" is sometimes difficult if those "bugs" existed in MS-DOS or earlier versions of Windows!
In the early 2000's, the Cmd shell was already running out of steam, and Microsoft and its customers were in urgent need of a more powerful and flexible Command-Line experience. This need fueled the creation of PowerShell (which originated from Jeffrey Snover's "The Monad Manifesto").
PowerShell is an object-oriented Shell, unlike the file/stream-based shells typically found in the *NIX world: Rather than handling streams of text, PowerShell processes streams of objects, giving PowerShell script writers the ability to directly access and manipulate objects and their properties, rather than having to write and maintain a lot of script to parse and manipulate text (e.g. via sed/grep/awk/lex/etc.)
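A rough Python analogy (this is not PowerShell itself, and the listing string below is illustrative) shows the difference: a text pipeline must parse formatted output, while an object pipeline simply reads properties:

```python
import os

# Text-style pipeline: recover a file size by parsing a rendered listing,
# which means guessing at column positions and formatting conventions.
listing = "-rw-r--r--  1 user user  1024 Jun 20 notes.txt"
size_from_text = int(listing.split()[4])

# Object-style pipeline: ask each object for its property directly - no
# parsing, no column counting, no locale-specific formatting to undo.
sizes = {e.name: e.stat().st_size for e in os.scandir(".") if e.is_file()}
print(size_from_text, len(sizes))
```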
Built atop the .NET Framework and Common Language Runtime (CLR), PowerShell's language & syntax were designed to combine the richness of the .NET ecosystem with many of the most common and useful features from a variety of other shells and scripting languages, with a focus on ensuring scripts are highly consistent and extremely ... well ... powerful.
To learn more about PowerShell, I recommend reading "PowerShell In Action" (Manning Press), written by Bruce Payette - the designer of the PowerShell syntax and language. The first few chapters in particular provide an illuminating discussion of the language design rationale.
PowerShell has been adopted by many Microsoft platform technologies, and partners, including Windows, Exchange Server, SQL Server, Azure and many others, and provides commands to administer, and control practically every aspect of a Windows machine and/or environment in a highly consistent manner.
PowerShell Core is the open-source future of PowerShell, and is available for Windows and various flavors of Linux, BSD, and macOS!
POSIX on NT, Interix, and Services For UNIX
When designing NT, Cutler & team specifically designed the NT kernel and OS to support multiple subsystems - interfaces between user-mode code, and the underlying kernel.
When Windows NT 3.1 first shipped in 1993, it supported several subsystems: MS-DOS, Windows, OS/2 v1.3, and POSIX v1.2. These subsystems allowed NT to run applications targeting several Operating System platforms upon the same machine and base OS, without virtualization or emulation - a formidable capability even today!
While Windows NT's original POSIX implementation was acceptable, it required significant improvements to make it truly capable, so Microsoft acquired Softway Systems and its "Interix" POSIX-compliant NT subsystem.
For the fascinating inside story on the origins, growth, and acquisition of Interix, read Stephen Walli's two-part story here: Part 1, and Part 2. For more technical details behind Interix and how it integrated into Windows, read Stephen's USENIX paper titled "INTERIX : UNIX Application Portability to Windows NT via an Alternative Environment Subsystem".
Interix was originally shipped as a separate add-on, and then later combined with several useful utilities and tools, and released as "Services For Unix" (SFU) in Windows Server 2003 R2, and Windows Vista. However, SFU was discontinued after Windows 8.
And then a funny thing happened...
Windows 10 - a new era for the Windows command-line!
Early in Windows 10's development, Microsoft opened up a UserVoice page, asking the community what features they wanted in various areas of the OS. The developer community was particularly vociferous in its requests that Microsoft:
- Make major improvements to the Windows Console
- Give users the ability to run Linux tools on Windows
Based on that feedback, Microsoft formed two new teams:
- The Windows Console & command-line team, charged with taking ownership of, and overhauling the Windows Console & command-line infrastructure
- A team responsible for enabling genuine, unmodified Linux binaries to run on Windows 10 - the Windows Subsystem for Linux (WSL)
The rest, as they say, is history!
Windows Subsystem for Linux (WSL)
Adoption of GNU/Linux based "distributions" (combinations of the Linux kernel and collections of user-mode tools) had been growing steadily, especially on servers and in the cloud. While Windows had a POSIX compatible runtime, SFU lacked the ability to run many Linux tools and binaries because of the latter's additional System Calls and behavioral differences vs. traditional Unix/POSIX.
Due to the feedback received from technical Windows customers and users, along with increasing demand inside Microsoft itself, Microsoft surveyed several options and ultimately decided to enable Windows to run unmodified, genuine Linux binaries!
In mid-2014, Microsoft formed a team to work on what would become the Windows Subsystem for Linux (WSL). WSL was first announced at Build 2016, and was previewed in Windows 10 Insider builds shortly afterwards.
In most Insider builds since then, and in each major OS release since the Anniversary Update in fall 2016, WSL's feature-breadth, compatibility, and stability have improved significantly: When WSL was first released, it was an interesting experiment that ran several common Linux tools but failed to run many common developer tools/platforms. The team iterated rapidly, and with considerable help from the community (thanks all!), WSL quickly gained many new capabilities, enabling it to run increasingly sophisticated Linux binaries and workloads.
Today (mid 2018), WSL happily runs the majority of Linux binaries, tools, compilers, linkers, debuggers, etc. Many developers, IT Pro's, devops engineers, and many others who need to run or build Linux tools, apps, services, etc. enjoy dramatically improved productivity, being able to run their favorite Linux tools alongside all their favorite Windows tools, on the same machine, without needing to dual-boot.
The WSL team continues to work on improving WSL's ability to execute many Linux scenarios, and improve its performance, and integration with the Windows experience.
The Windows Console Reboot and Overhaul
In late 2014, with the project to build the Windows Subsystem for Linux (WSL) in full swing, and amid an explosion of reinvigorated interest in all things Command-Line, the Windows Console was ... well ... clearly in need of some TLC, and required many improvements frequently requested by customers and users.
In particular, the Console was lacking many features expected of modern *NIX compatible systems, such as the ability to parse & render ANSI/VT sequences used extensively in the *NIX world for rendering rich, colorful text and text-based UI's.
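These VT/ANSI sequences are nothing more than in-band characters: an ESC byte (0x1b), a "[", some parameters, and a command letter. A short Python sketch shows how the colorful output mentioned above is actually constructed:

```python
# A VT-capable terminal interprets these sequences as styling commands
# instead of printing the bytes literally.
CSI = "\x1b["        # Control Sequence Introducer: ESC followed by '['

bold  = CSI + "1m"   # bold/bright attribute
red   = CSI + "31m"  # set foreground color to red
reset = CSI + "0m"   # reset all attributes

line = f"{bold}{red}error:{reset} something happened"
print(line)          # renders in bold red on a VT-capable terminal
```

On the Windows 7 Console these escape bytes were printed as garbage; on Windows 10 (and on any *NIX terminal) they render as styled text.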
What, then, would be the point of building WSL if the user would not be able to see and use Linux tools correctly?
Below is an example of what the Console renders in Windows 7 vs. Windows 10: Note that Windows 7's Console (left) is unable to correctly render the VT sequences generated by the tmux, htop, Midnight Commander, and cowsay Linux tools, whereas they render correctly in Windows 10 (right):
So, in 2014, a new, small, "Windows Console Team" was formed, charged with the task of unravelling, understanding, and improving the Console code-base … which by this time was ~28 years old - older than the developers working on it!
As any developer who's ever had to adopt an old, crufty, less-than-optimally-maintained codebase will attest, modernizing old code is generally "tricky". Doing so without breaking existing behaviors is trickier still. Updating the most frequently launched executable in all of Windows without breaking millions of customers' scripts, tools, login scripts, build systems, manufacturing systems, analysis and production systems, etc. requires a great deal of "care and patience"
To compound these challenges, the team quickly came to learn how exacting customers' expectations of the Console are: For example, if Console performance deviates by even a percentage point or two from one build to the next, alarms fire in the Windows Build team, resulting in ... ahem ... "swift, and direct feedback", usually demanding immediate fixes.
So, when we discuss Console improvements & new features in future articles, remember that there are a few inviolate tenets against which each change is measured, including:
- DO NOT introduce/expose new security vulnerabilities
- DO NOT break existing customers (internal or external), tools, scripts, commands, etc.
- DO NOT regress performance or increase memory consumption / IO (without clear and well communicated reasons)
Over the last 3 years, the Console team has:
- Massively overhauled the Console's internals
- Dramatically simplified, and reduced the volume of code in the Console
- Replaced several internally implemented collections, lists, stacks, etc. with STL containers
- Modularized and isolated logical and functional units of code, enabling features to be improved (and on occasion replaced), without "breaking the world"
- Consolidated several previously separate and incompatible Console engines into one
- Added MANY reliability, safety, and security improvements
- Added the ability to parse and render ANSI/VT sequences, enabling the Console to accurately render rich text output from *NIX and other modern command-line tools & apps
- Enabled the Console to render 24-bit Colors, up from just 16 colors previously!
- Improved Console accessibility, enabling Narrator and other UIA apps to navigate the contents of the Console Window
- Added / improved mouse and touch support
And the work continues! We're currently wrapping up the implementation of a couple of exciting new features that we'll discuss in up-coming posts in this series.
So, where are we?
If you read this far, congratulations and thank you!
So why the history lesson?
As I hope you can understand from reading the history above, it's important to understand that the Command-Line has remained a pivotal component of Microsoft's strategy, platform, and ecosystem.
Even while Microsoft promoted the Windows GUI to end-users, Microsoft, and its technical customers/users/partners, rely heavily on the Windows Command-Line for a multitude of technical tasks.
In fact, Microsoft literally could not build Windows itself, nor any of its other software products, without a fast, efficient, stable, and secure Console!
Throughout the MS-DOS, Unix, OS/2, and Windows eras, the Command-Line has remained as perhaps the most crucial tool in every technical user's toolbox! Even the many users who rarely/never type commands into a Console themselves use the Console every day! When you build your code in Visual Studio (VS), your build is spawned in a hidden Console window! If you use Exchange Server, or SQL Server's admin tools, many of those commands are executed via PowerShell in a hidden Console!
In this post, we covered a lot of ground: We reviewed some of Microsoft's OS history as it pertains to the Command-Line and Windows Console. We also gained an understanding of the Windows Console's origins.
In the next post, we'll start digging into the technology itself. Stay tuned for more!
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
https://blogs.msdn.microsoft.com/commandline/2018/07/20/windows-command-line-inside-the-windows-console/
Windows Command-Line: Inside the Windows Console
Welcome to the third post in the Windows Command-Line series. In this post, we'll start to dig into the internals of the Windows Console and Command-Line, what it is, what it does ... and what it doesn't do.
Posts in this series:
- Command-Line Backgrounder
- The Evolution of the Windows Command-Line
- Inside the Windows Console (this post)
[Updated 2018-07-20 to improve readability and clarify some Unicode/UTF-x details]
During the initial development of Windows NT, circa 1989, there was no GUI, there was no desktop, there was ONLY a full-screen command-line that visually resembled MS-DOS more than it did the future. When the Windows GUI's implementation started to arrive, the team needed a Console GUI app, and thus the Windows Console was born. Windows Console is one of the first Windows NT GUI apps, and is certainly one of the oldest Windows apps still in general use.
The Windows Console code-base is currently (July 2018) almost 30 years old ... older, in fact, than the developers who now work on it!
What does the Console do?
As we learned in our previous posts, a Terminal's job is relatively simple:
- Handle User Input
- Accept input from devices including keyboard, mouse, touch, pen, etc.
- Translate input into relevant characters and/or ANSI/VT sequences
- Send characters to the connected app/tool/shell
- Handle App Output:
- Accept text output from a connected Command-Line app/tool
- Update the display as required, based on the received app output (e.g. output text, move the cursor, set text color, etc.)
- Handle System Interactions:
- Launch when requested
- Manage resources
- Resize/maximize/minimize, etc.
- Terminate when required, or when the communications channel is closed/terminated
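The input/output loop above can be sketched in a few lines. Here's a deliberately over-simplified, hypothetical Python relay of my own (real terminals handle interactive bidirectional I/O, VT parsing, and rendering, none of which is shown here):

```python
import subprocess

def run_in_terminal(argv):
    """Toy 'terminal' loop: launch a command-line app, relay its text
    output, and stop when the communications channel closes.
    (Illustrative sketch only - not a real terminal implementation.)"""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    output = []
    for line in proc.stdout:   # "Handle App Output": accept the app's text
        output.append(line)    # a real terminal would parse VT sequences here
    proc.wait()                # "Terminate when the channel is closed"
    return "".join(output)
```

A real terminal also feeds user keystrokes back into `stdin` and updates a display buffer; this sketch only shows the "accept output from the connected app" half of the contract.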
However, the Windows Console does things a little differently:
Inside the Windows Console
Windows Console is a traditional Win32 executable and, though it was originally written in 'C', much of the code is being migrated to modern C++ as the team modernizes and modularizes Console's codebase.
For those who care about such things: Many have asked whether Windows is written in C or C++. The answer is that - despite NT's Object-Based design - like most OSes, Windows is almost entirely written in 'C'. Why? C++ introduces a cost in terms of memory footprint and code execution overhead. Even today, the hidden costs of code written in C++ can be surprising, but back in the early 1990's, when memory cost ~$60/MB (yes ... $60 per MEGABYTE!), the hidden memory cost of vtables etc. was significant. In addition, the cost of virtual-method call indirection and object-dereferencing could result in very significant performance & scale penalties for C++ code at that time. While one still needs to be careful, the performance overhead of modern C++ on modern computers is much less of a concern, and is often an acceptable trade-off considering its security, readability, and maintainability benefits ... which is why we're steadily upgrading the Console's code to modern C++.
So, what's inside the Windows Console?
Before Windows 7, Windows Console instances were hosted in the crucial Client Server Runtime Subsystem (CSRSS). In Windows 7, however, Console was extracted from CSRSS for security and reliability reasons, and given a new home in the following binaries:
- conhost.exe - the user-mode Windows Console UX & command-line plumbing
- condrv.sys - a Windows kernel driver providing communication infrastructure between conhost and one or more Command-Line shells/tools/apps
A high-level view of Console's current internal architecture looks like this:
The core components of the Console consist of the following (from the bottom-up):
- ConDrv.sys - Kernel-Mode driver
- Provides a high-performance communications channel between Console and any connected Command-Line apps
- Ferries IO Control (IOCTL) messages back and forth between Command-Line apps and the Console they're "attached" to
- Console IOCTL messages contain
- Data representing requests to execute API calls against the Console instance
- Text sent from the Console to the Command-Line app
- ConHost.exe - Win32 GUI app:
- ConHost Core - the Console's internals and plumbing
- API Server: Converts IOCTL messages received from Command-Line app(s) into API calls, and sends text records from Console to Command-Line app
- API: Implements the Win32 Console API & logic behind all the operations that the Console can be asked to perform
- Input Buffer: Stores keyboard and mouse event records generated by user input
- VT Parser: If enabled, parses text for embedded ANSI/VT sequences, extracts them from the text, and generates equivalent API calls instead
- Output Buffer: Stores the text displayed on the Console’s display. Essentially a 2D array of CHAR_INFO structs which contain each cell's character data & attributes (more on the buffer below)
- Other: Components not shown in the diagram above, including the settings infrastructure that stores/retrieves values from the registry and/or shortcut files, etc.
- Console UX App Services - the Console UX & UI layer
- Manages the layout, size, position, etc. of the Console window on-screen
- Displays and handles settings UI, etc.
- Pumps the Windows message queue, handles Windows messages, and translates user input into key and mouse event records, storing them in the Input Buffer
The Windows Console API
As can be seen in the Console architecture above, unlike *NIX terminals, the Console sends/receives API calls and/or data serialized into IO Control (IOCTL) messages, not serialized text. Even ANSI/VT sequences embedded in text received from (primarily Linux) Command-Line apps are extracted, parsed, and converted into API calls. This difference exposes the key fundamental philosophical difference between *NIX and Windows: In *NIX, "everything is a file", whereas, in Windows, "everything is an object".
There are pros and cons to both approaches, which we'll outline, but avoid debating at length here. Just remember that this key difference in philosophy is fundamental to many of the differences between Windows and *NIX!
In *NIX, Everything is a File
When Unix was first implemented in the late 1960's and early 1970's, one of the core tenets was that (wherever possible) everything should be abstracted as a file stream. One of the key goals was to simplify the code required to access devices and peripherals: If all devices presented themselves to the OS as file-systems, then existing code could access those devices more easily. This philosophy runs deep: One can even navigate and interrogate a great deal of a *NIX-based OS & machine configuration by navigating pseudo/virtual file-systems which expose what appear to be "files" and folders, but actually represent machine configuration and hardware. For example, in Linux, one can explore a machine's processors' properties by examining the contents of the /proc/cpuinfo pseudo-file.
The simplicity and consistency of this model can, however, come at a cost: Extracting/interrogating specific information from text in pseudo-files, or from text returned by executing commands, often requires tools, e.g. sed, awk, perl, python, etc. These tools are used to write commands and scripts that parse the text content, looking for specific patterns, fields, and values. Some of these scripts can get quite complex, are often difficult to maintain, and can be fragile - if the structure, layout, and/or format of the text changes, many scripts will likely need to be updated.
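To make the text-parsing approach (and its fragility) concrete, here's a small Python sketch of my own that extracts fields from /proc/cpuinfo-style text. The sample content is illustrative, not taken from a real machine:

```python
# Illustrative sample of /proc/cpuinfo-style "key : value" text.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: Hypothetical CPU @ 2.40GHz\n"
    "processor\t: 1\n"
    "model name\t: Hypothetical CPU @ 2.40GHz\n"
)

def parse_cpuinfo(text):
    """Parse 'key : value' lines into one dict per processor.
    Brittle by design: if field names or layout change, it breaks -
    exactly the fragility the text-stream model can suffer from."""
    cpus, current = [], None
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":   # each 'processor' line starts a new record
            current = {}
            cpus.append(current)
        if current is not None:
            current[key] = value
    return cpus

cpus = parse_cpuinfo(SAMPLE)
```

Note how the parser silently depends on the exact field name "processor" and the colon-separated layout - a renamed field or reordered output would require updating every such script.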
In Windows, Everything is an Object
When Windows NT was being designed & built, "Objects" were seen as the future of software design: "Object Oriented" languages were emerging faster than rabbits from a burrow - Simula and Smalltalk were already established, and C++ was becoming popular. Other Object-Oriented languages like Python, Eiffel, Objective-C, ObjectPascal/Delphi, Java, C#, and many others followed in rapid succession.
Inevitably, having been forged during those heady, Object-Oriented days (circa 1989), Windows NT was designed with a philosophy that "everything is an object". In fact, one of the most important parts of the NT Kernel is the "Object Manager"!
Developers use Windows' Win32 API to access and manipulate objects and structures that provide access to similar information provided by *NIX pseudo files and tools. And because parsers, compilers, and analyzers understand the structure of objects, many coding errors can often be caught earlier, helping verify that the programmer's intent is syntactically and logically correct. This can also result in less breakage, volatility, and "churn" over time.
So, coming back to our central discussion about Windows Console: The NT team decided to build a "Console" which differentiated itself from a traditional *NIX terminal in a couple of key areas:
- Console API: Rather than relying on programmers' ability to generate "difficult to verify" ANSI/VT-sequences, Windows Console can be manipulated and controlled via a rich Console API
- Common services: To avoid having every Command-Line shell re-implement the same services time and again (e.g. Command History, Command Aliasing), the Console itself provides some of these services, accessible via the Console API
Problems with the Windows Console
While the Console's API has proven very popular in the world of Windows Command-Line tools and services, the API-centric model presents some challenges for Command-Line scenarios:
Windows' Command-Line & cross-platform interop
Many Windows Command-Line tools and apps make extensive use of the Console API.
The problem? These APIs only work on Windows. Thus, combined with other differentiating factors (e.g. process lifecycle differences, etc.), Windows Command-Line apps are not always easily-portable to *NIX, and vice-versa.
Because of this, the Windows ecosystem has developed its own, often similar, but usually different Command-Line tools and apps. This means that users have to learn one set of Command-Line apps and tools, shells, scripting languages, etc. when using Windows, and another when using *NIX.
There is no simple quick-fix for this issue: The Windows Console and Command-Line cannot simply be thrown away and replaced by bash and iTerm2 because there are hundreds of millions of apps, scripts, and tools that depend upon the Windows Console and Cmd/PowerShell shells, many of which are launched billions of times a day on Windows PC's and Servers around the globe.
So, what's the solution here? How do developers run command-line tools, compilers, platforms, etc. originally built primarily on/for *NIX based platforms?
3rd party tools like MinGW/MSYS and Cygwin do a great job of porting many of the core GNU tools and compatibility libraries to Windows, but they are not able to run un-ported, unmodified Linux binaries. This turns out to be an essential requirement, because many Ruby, Python, Node, etc. packages and modules depend upon Linux behaviors and/or "wrap" Linux binaries.
These reasons led Microsoft to enable genuine, unmodified Linux binaries and tools to run natively on Windows, via the Windows Subsystem for Linux (WSL).
Using WSL, users can now download and install one or more genuine Linux distros side-by-side on the same machine, and use each distros' or tools' package manager (e.g. apt, zypper, npm, gem, etc.) to install and run the vast majority of Linux Command-Line tools, packages, and modules alongside their favorite Windows apps and tools. To learn more about WSL, visit the WSL Learning Page, or the official WSL documentation.
Also, there are still some things that Console offers that haven't been adopted by non-Microsoft terminals: Specifically, the Windows Console provides command-history and command-alias services, which aim to eliminate the need for every command-line shell (in particular) to re-re-re-implement the same functionality. We'll return to this subject in the future.
Remoting Windows' Command-Line is difficult
As we discussed in the Command-Line Backgrounder post, Terminals were originally separate from the computer to which they were attached. Fast-forward to today, this design remains: Most modern terminals and Command-Line apps/shells/etc. are separated by processes and/or machine boundaries.
On *NIX-based platforms, the notion that terminals and command-line applications are separate and simply exchange characters has resulted in *NIX Command-Lines being easy to access and operate from a remote computer/device: As long as a terminal and a Command-Line application can exchange streams of characters via some type of ordered serial communications infrastructure (TTY/PTY/etc.), it is pretty trivial to remotely operate a *NIX machine's Command-Line.
On Windows, however, many Command-Line applications depend on calling Console API's, and assume that they're running on the same machine as the Console itself. This makes it difficult to remotely operate Windows Command-Line shells/tools/etc.: How does a Command-Line application running on a remote machine call API's on the user's local machine's Console? And worse, how does the remote Command-Line app call Console API's if it's being accessed via a terminal on a Mac or Linux box?!
Sorry to tease, but we'll return to this subject in much more detail in a future post!
Launching the Console … or not!
Generally, on *NIX based systems, when a user wants to launch a Command-Line tool, they first launch a Terminal. The Terminal then starts a default shell, or can be configured to launch a specific app/tool. The Terminal and Command-Line app communicate by exchanging streams of characters via a Pseudo TTY (PTY) until one or both are terminated.
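On *NIX, the pseudo-TTY plumbing described above is directly scriptable. Here's a minimal sketch of my own using Python's standard pty module (POSIX-only), showing the two ends of the character channel a terminal and an app would share:

```python
import os
import pty

# Open a pseudo-terminal pair: a terminal emulator would hold the master
# end, and a command-line app would be attached to the slave end.
master_fd, slave_fd = pty.openpty()

# Simulate a command-line app writing text to its (slave) end of the PTY...
os.write(slave_fd, b"hello from the app\n")

# ...and the terminal reading the resulting character stream from the
# master end. (The TTY line discipline may translate '\n' to '\r\n'.)
data = os.read(master_fd, 1024)

os.close(master_fd)
os.close(slave_fd)
```

Everything the two sides exchange is just an ordered stream of characters, which is precisely why remoting a *NIX command-line is so straightforward: any serial channel can carry that stream.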
Console confusion
On Windows, however, things work differently: Windows users never launch the Console (conhost.exe) itself: Users launch Command-Line shells and apps, not the Console itself!
SAYWHATNOW?
Yes, in Windows, users launch the Command-Line app, NOT the Console itself. If a user launches a Command-Line app from an existing Command-Line shell, Windows will (usually) attach the newly launched Command-Line .exe to the current Console. Otherwise, Windows will spin up a new Console instance and attach it to the newly launched app.
Because users run Cmd.exe or PowerShell.exe and see a Console window appear, they labor under the common misunderstanding that Cmd and PowerShell are, themselves, "Consoles" ... they're not! Cmd.exe and PowerShell.exe are "headless" Command-Line applications that need to be attached to a Console (conhost.exe) instance from which they receive user input and to which they emit text output to be displayed to the user.
Also, many people say "Command-Line apps run in the Console". This is misleading and contributes additional confusion about how Consoles and Command-Line apps actually work!
Please help correct this misconception if you hear it by pointing out that "Command-Line tools/apps run connected to a Console" (or similar). Thanks!
Okay, so, Windows Command-Line apps run in their own processes, connected to a Console instance running in a separate process. This is just like in *NIX where Command-Line applications run connected to Terminal apps. Sounds good, right? Well ... no; there are some problems here because Console does things a little differently:
- Console and Command-Line app communicate via IOCTL messages through the driver, not via text streams (as in *NIX)
- Windows mandates that ConHost.exe is the Console app which is connected to Command-Line apps
- Windows controls the creation of the communication "pipes" via which the Console and Command-Line app communicate
These are significant limitations, especially the latter point. Why? What if you wanted to create an alternate Console app for Windows? How would you send keyboard/mouse/pen/etc. user actions to the Command-Line app if you couldn't access the communications "pipes" connecting your new Console to the Command-Line app?
Alas, the story here is not a good one: There ARE some great 3rd party Consoles (and server apps) for Windows (e.g. ConEmu/Cmder, Console2/ConsoleZ, Hyper, Visual Studio Code, OpenSSH, etc.), but they have to jump through extraordinary hoops to act like a normal Console would.
For example, 3rd party Consoles have to launch a Command-Line app off-screen at, for example, (-32000,-32000). They then have to send keystrokes to the off-screen Console, and screen-scrape the off-screen Console's text contents and re-draw them on their own UI! I know, crazy, right?! It's a testament to the ingenuity and determination of the creators of these apps that they even work at all.
This is clearly a situation we are keen to remedy. Stay tuned for more info on this part of the story too - there's some good news on the way.
Windows Console & VT
As discussed above, Windows Console provides a rich API. Using the Console API, Command-Line apps and tools write text, change text colors, move the cursor, etc. And, because of the Console API, Windows Console had little need to support ANSI/VT sequences that provide very similar functionality on other platforms. In fact, until Windows 10, Windows Console only implemented the bare minimum support for ANSI/VT sequences:
This all started to change in 2014, when Microsoft formed a new Windows Console team dedicated to untangling and improving the Console & Windows' Command-Line infrastructure.
One of the new Console team's highest priorities was to implement comprehensive support for ANSI/VT sequences in order to render the output of *NIX applications running on Windows Subsystem for Linux (WSL), and on remote *NIX machines. You can read a little more about this story in the previous post in this series.
The Console team added comprehensive support for ANSI/VT sequences to Windows 10's Console, enabling users to use and enjoy a huge array of Windows and Linux Command-Line tools and apps. The team continues to improve and refine Console's VT support with each OS release, and is grateful for any issues you file on our GitHub issues tracker.
Handling Unicode
A quick Unicode refresher: Unicode or ISO/IEC 10646 is an international standard defining every character/glyph used in almost every writing system on Earth, plus many non-script symbols and character-sized images (e.g. emoji) in use today. At present (July 2018), Unicode 11 defines 137,439 characters across 146 modern and historic scripts! Unicode also defines several character encodings, including UTF-8, UTF-16, and UTF-32:
- UTF-8: 1 byte for each of the first 128 code points (maintaining compatibility with ASCII), and 2-4 bytes in total for all other characters
- UTF-16/UCS-2: 2 bytes for each character. UCS-2 (used internally by Windows) supports encoding only the first 65,536 code points (known as the Basic Multilingual Plane - BMP). UTF-16 extends UCS-2 by using 4-byte surrogate pairs to encode characters in the 16 additional planes
- UTF-32: 4-bytes per character
The most popular encoding today, thanks to its efficient storage requirements, and widespread use in HTML pages, is UTF-8. UTF-16/UCS-2 are both common, though decreasingly so in stored documents (e.g. web pages, code, etc.). UTF-32 is rarely used due to its inefficient and considerable storage requirements. Great, so we have effective and efficient ways to represent and store Unicode characters!
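The encoding trade-offs above are easy to verify in Python, whose str type is Unicode-aware. The following sketch (my own illustration) compares the byte cost of the same characters under each encoding, using "-le" variants to exclude byte-order marks:

```python
# Byte counts for an ASCII character, a BMP character, and a
# supplementary-plane character under each Unicode encoding.
for ch in ("A", "€", "😀"):
    utf8 = len(ch.encode("utf-8"))
    utf16 = len(ch.encode("utf-16-le"))   # -le: little-endian, no BOM
    utf32 = len(ch.encode("utf-32-le"))
    print(f"U+{ord(ch):05X} {ch!r}: UTF-8={utf8}, UTF-16={utf16}, UTF-32={utf32}")
```

"A" costs 1/2/4 bytes, "€" costs 3/2/4, and "😀" (outside the BMP, so a UTF-16 surrogate pair) costs 4/4/4 - which is also why UCS-2, limited to the BMP, cannot represent "😀" at all.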
So?
Alas, the Windows Console and its API were created before Unicode was created. The Windows Console stores text (that is subsequently drawn on the screen) as UCS-2 characters requiring 2 bytes per cell. Command-Line apps write text to the Console using the Console API. Many Console APIs come in two flavors - functions with an A suffix handle single-byte/character strings, and functions with a W suffix handle 2-byte (wchar)/character strings: For example, the WriteConsoleOutputCharacter() function compiles down to WriteConsoleOutputCharacterA() for ASCII projects, or WriteConsoleOutputCharacterW() for Unicode projects. Code can call ...A or ...W suffixed functions directly if specific handling is required.
However, while all W APIs support UCS-2, and some were updated to also support UTF-16, not all W APIs fully support UTF-16.
Also, Console doesn't support some newer Unicode features including Zero Width Joiners (ZWJ) which are used to combine otherwise separate characters in, for example, Arabic and Indic scripts, and are even used to combine several emoji characters into one visual glyph like the "people" emoji, and ninjacats.
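A concrete example of a ZWJ sequence, sketched in Python: the "family" emoji is several separate code points joined by U+200D (Zero Width Joiner). A ZWJ-aware renderer draws one combined glyph; a renderer that doesn't understand ZWJ draws each component separately:

```python
ZWJ = "\u200d"  # Zero Width Joiner (U+200D)

# man + ZWJ + woman + ZWJ + girl: one visual "family" glyph on
# ZWJ-aware renderers, but five distinct Unicode code points.
family = "👨" + ZWJ + "👩" + ZWJ + "👧"

print(len(family))                        # code points: 5
# Each emoji here lives outside the BMP, so each needs a UTF-16
# surrogate pair: 2+1+2+1+2 = 8 16-bit code units in total.
print(len(family.encode("utf-16-le")) // 2)  # UTF-16 code units: 8
```

This also illustrates the storage problem: a single visual glyph can require many UCS-2 cells, which the Console's one-CHAR_INFO-per-cell buffer was never designed to accommodate.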
Worse still, the Console's current text renderer can't even draw these complex glyphs, even if the buffer could store them: Console currently uses GDI for text rendering, but GDI doesn't adequately support font-fallback - a mechanism to dynamically find and load an alternative font that contains a glyph missing from the current font. Font-fallback is well supported by more modern text rendering engines like DirectWrite.
So what happens if you wanted to write complex and conjoined glyphs onto the Console? Sadly, you can't ... yet, but this too is a post for another time.
So, where are we?
Once again, dear reader, if you've read everything above, thank you, and congratulations - you now know more about the Windows Console than most of your friends, and likely more than even you wanted to! Lucky you!
We've covered a lot of ground in this post:
- The major building-blocks of the Windows Console:
- Condrv.sys - the Console communication driver
- ConHost.exe - the Console UX, internals, and plumbing:
- API Server - serializes API calls and text data via IOCTL messages sent to/from the driver
- API - the functionality of the Console
- Buffers - Input buffer storing user input, output buffer storing output/display text
- VT Parser - converts ANSI/VT sequences embedded in the text stream into API calls
- Console UX - the Console's UI state, settings, features
- Other - Misc lifetime, security, etc.
- What the Console does
- Sends user input to the connected Command-Line app
- Receives and displays output from the connected Command-Line app
- How Console differs from *NIX terminals
- *NIX: "Everything is a file/text-stream"
- Windows: "Everything is an object, accessible via an API"
- Console Problems
- Console and Command-Line apps communicate via API call requests and text serialized into IOCTL messages
- Only Windows command-line apps call the Console API
- More work to port Command-Line apps to/from Windows
- Apps call Windows API to interact with Console
- Makes remoting Windows Command-Line apps/tools difficult
- Dependence on IOCTLs breaks the "exchange of characters" terminal design, making it difficult to operate remote Windows Command-Line tools from non-Windows machines
- Launching Windows Command-Line apps is "unusual"
- Only ConHost.exe can be attached to Command-Line apps
- 3rd party terminals forced to create off-screen Console and send-keys/screen-scrape to/from it
- Windows historically doesn't understand ANSI/VT sequences
- Mostly remedied in Windows 10
- Console has limited support for Unicode & currently struggles to store and render modern UTF-8 text and characters requiring Zero Width Joiners
In the next few posts in this series, we'll delve further into the Console, and discuss how we're addressing these issues ... and more! As always, stay tuned! [Many thanks to my colleagues on the Console team for helping keep this post accurate and balanced - Michael, Mike, Dustin and Austin - y'all rock!]
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
- Bloggery committed by chris tower - 1807.24 - 10:10
- Days ago = 1116 days ago
- New note - On 1807.06, I ceased daily transmission of my Hey Mom feature after three years of daily conversations. I plan to continue Hey Mom posts at least twice per week but will continue to post the days since ("Days Ago") count on my blog each day. The blog entry numbering in the title has changed to reflect total Sense of Doubt posts since I began the blog on 0705.04, which include Hey Mom posts, Daily Bowie posts, and Sense of Doubt posts. Hey Mom posts will still be numbered sequentially. New Hey Mom posts will use the same format as all the other Hey Mom posts; all other posts will feature this format seen here.