Powershell to Clear a Huge SharePoint List

Sometimes you end up with lists full of millions of list items in SharePoint, and there is no straightforward way of emptying them. That’s where Powershell comes in.

The example below shows you how you can setup a kind of cursor to step through the deletes. The actual example is for a Nintex Workflow History list – these can get huge if you don’t stay on top of them (depending on how much debug data workflow designers throw at them).

$site_collection_url = "https://foo.com"
$batch_size = 1000

$site = get-spsite $site_collection_url
$web = get-spweb $site_collection_url
$list = $web.Lists["NintexWorkflowHistory"]
$query = New-Object Microsoft.SharePoint.SPQuery
$query.ViewAttributes = "Scope='Recursive'";
$query.RowLimit = $batch_size
$query.Query = '<OrderBy Override="TRUE"><FieldRef Name="ID" /></OrderBy>'
$process_count = 0
do {
  $start_time = Get-Date
  write-host $(" - [Compiling Batch (" + $batch_size + " items)]") -nonewline
  $listItems = $list.GetItems($spQuery)
  $count = $listItems.Count
  $query.ListItemCollectionPosition = $listItems.ListItemCollectionPosition
  $batch = "<?xml version=`"1.0`" encoding=`"UTF-8`"?><Batch>"
  $j = 0
  for ($j = 0; $j -lt $count; $j++)
  {
    $item = $listItems[$j]
    $batch += "<Method><SetList Scope=`"Request`">$($list.ID)</SetList><SetVar Name=`"ID`">$($item.ID)</SetVar><SetVar Name=`"Cmd`">Delete</SetVar><SetVar Name=`"owsfileref`">$($item.File.ServerRelativeUrl)</SetVar></Method>"
    if ($i -ge $count) { break }
  }
  $batch += "</Batch>"
  write-host " [Sending Batch]" -nonewline
  $result = $web.ProcessBatchData($batch)

  write-host " [Emptying Web Recycle Bin]" -nonewline
  $web.RecycleBin.DeleteAll()
  
  write-host " [Emptying Site Recycle Bin]" -nonewline
  $site.RecycleBin.DeleteAll()

  $end_time = Get-Date
  $process_count += $batch_size
  write-host $(" [Processing Time " + ($end_time - $start_time).TotalSeconds + "] [Processed " + $process_count + " so far]") -nonewline
  write-host " [Waiting 2 seconds]"
  start-sleep -s 2

} while ($spQuery.ListItemCollectionPosition -ne $null)

$web.Dispose()

The key to the whole thing is a little known property of the query called ListItemCollectionPosition, that acts like a cursor in a database.

You will also notice that the script empties the bin, and the site collection recycle bin every time it iterates. You might want to keep an eye on the Database transaction logs if you’re removing millions of records too.

How SharePoint Chooses Folder Content Types

When you create a document library, the document library (by default) has two content types – Document, and Folder. Folder is hidden from you – but when you list the content types of the library in Powershell, the following will be reported:

  • Document
  • Folder

When you create a folder in WebDAV, the Folder Content Type is chosen by SharePoint automatically. Interestingly, if you use powershell to remove the Folder content type, and then create another folder via WebDAV, it will have NO content type listed, even though the folder will still work as a folder in the library.

Here’s the catch – and how you can confuse SharePoint…

If you remove the Folder Content Type, and add a Document Set Content Type, and then go to WebDAV to create a folder, the folder will get the Document Set content type. The reason for this appears to be in the way SharePoint chooses the folder content type – Folders and Document Set IDs (GUIDs) both begin 0x0120. It looks like SharePoint finds the document set content type, and uses it for the folder. It even does it if you add the folder content type back, which points towards SharePoint interrogating the content types in the order they were applied to the library for the first one where the GUID begins 0x0120.

You would think you would be able to re-write the UniqueContentTypeOrder for the RootFolder of the library to correct things, and you would be wrong. The only reliable method I have found to correct libraries where this has happened is to re-construct the entire library.

Enabling Features in SharePoint MySites

One of the more common administration tasks you might get thrown is enabling a given feature within all the MySites throughout an on-premesis SharePoint farm.

The first thing the script will need to do is set out a few ArrayLists to help us manage things – a list of users, a list of site (site collection) feature names, and a list of web (site) feature names.

$account_names = New-Object System.Collections.ArrayList
$account_names.Add("some_username") > $null

$site_feature_names = New-Object System.Collections.ArrayList
$site_feature_names.Add("SPPCopyMoveSite") > $null

$web_feature_names = New-Object System.Collections.ArrayList

The next thing we will need to do is load the SharePoint snap-in…

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

And then load our array with the users we want to process (the $mysite_host_site_url can be any URL in reality – it’s just used to grab the profile manager).

$mysite_host_site = Get-SPSite $mysite_host_site_url
$context = Get-SPServiceContext -Site $mysite_host_site
$profileManager = New-Object Microsoft.Office.Server.UserProfiles.UserProfileManager($context)

Next we need to load the profiles we want to work on into a list we can easily iterate through (mainly to make the rest of the code look nice).

$profiles = New-Object System.Collections.ArrayList
foreach($account_name in $account_names)
{
    $profile = $profileManager.GetUserProfile($account_name)
    $profiles.Add($profile) > $null
}

And finally, we loop through each profile, look at their personal site, and then through the list of features we want to switch on – and switch them on. We also check if they exist, and if they are already enabled.

foreach ($profile in $profiles)
{
    Write-Host $("Processing [" + $profile.AccountName + "]")
       
    if ($profile.PersonalSite)
    {
        Write-Host $(" - Site URL [" + $profile.PersonalUrl + "]")
        $site = get-spsite $profile.PersonalUrl
        $web = $site.RootWeb

        if ($site_feature_names.length -gt 0) {
            Write-Host $(" - Processing Site Features")
            foreach ($site_feature_name in $site_feature_names)
            {
                Write-Host $(" - Processing [" + $site_feature_name + "]")
            
                $feature = Get-SPFeature -limit ALL -identity $site_feature_name -ErrorAction SilentlyContinue

                if ($feature)
                {
                
                    $site_feature = Get-SPFeature -limit ALL -identity $site_feature_name -site $site.Url -ErrorAction SilentlyContinue
                    if ($site_feature -eq $null)
                    {
                        Write-Host "  - Enabling Feature"
                        Enable-SPFeature -identity $feature.Id -url $site.Url
                        Write-Host "  - Feature Enabled"
                    } else {
                        Write-Host "  - Feature Already Enabled"
                    }
                }
                else
                {
                    Write-Host "  - Feature Not Found"
                }
            }
            Write-Host ""
        }


        if ($web_feature_names.length -gt 0) {
            Write-Host $(" - Processing Web Features")
            foreach ($web_feature_name in $web_feature_names)
            {
                Write-Host $(" - Processing [" + $web_feature_name + "]")
            
                $feature = Get-SPFeature -limit ALL -identity $web_feature_name -ErrorAction SilentlyContinue

                if ($feature)
                {
                
                    $web_feature = Get-SPFeature -limit ALL -identity $web_feature_name -web $web.Url -ErrorAction SilentlyContinue
                    if ($web_feature -eq $null)
                    {
                        Write-Host "  - Enabling Feature"
                        Enable-SPFeature -identity $feature.Id -url $site.Url
                        Write-Host "  - Feature Enabled"
                    } else {
                        Write-Host "  - Feature Already Enabled"
                    }
                }
                else
                {
                    Write-Host "  - Feature Not Found"
                }
            }
        }
        
        Write-Host " - Releasing Profile Site Resources"
        $web.Dispose()
        $site.Dispose()
        Write-Host ""
    }
    else
    {
        Write-Host $(" - Profile has no Site")
    }
    
    Write-Host ""

}

Write-Host "Releasing MySite Host Resources"
$mysite_host_site.Dispose()

The same process could obviously be modified to work with any collection of sites really – not just MySites. Feel free to borrow and change.

Bulk Provisioning of SharePoint MySites with Powershell

In the world of on-premesis SharePoint systems, MySites are provisioned on an “on demand” basis for users – when the user first visits their “my site”, the site is provisioned for them from the stock template.

You might want to make some changes to the MySite before the user gets access to it though – to configure the Document library, for example – therefore you want the site to get provisioned in advance, to give yourself the chance to modify it.

First of all, we will need to load the SharePoint snap-in

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

Next we need to get the users we want to deal with – we can do this by looping through the user profiles – it’s worth noting that the mysite_host_site_url variable is typically the root site that MySites live within – but it doesn’t have to be.

$mysite_host_site = Get-SPSite $mysite_host_site_url
$context = Get-SPServiceContext -Site $mysite_host_site
$profileManager = New-Object Microsoft.Office.Server.UserProfiles.UserProfileManager($context)
$profiles = $profileManager.GetEnumerator()

Next we need to loop through the users, and provision their personal sites if they have no already been provisioned.

foreach($profile in $profiles)
{
    Write-Host $("Processing [" + $profile.AccountName + "]")
    if ($profile.PersonalSite) {
        Write-Host $(" - has personal site")
    } else {
        Write-Host $(" - Creating Personal Site")
        $profile.CreatePersonalSite()
        Write-Host $(" - Done [" + $profile.PersonalUrl + "]")
    }
}

There is an important point here – the script MUST be run as the service user that usually provisions personal sites (typically the app pool account). It makes sense if you think about it – the sites are being created by proxy. If you try to run it as anybody else, you will see permissions errors.

Another note – you can’t create personal sites through normal means (i.e. create stock sites to match those created by a call to CreatePersonalSite – because the code in the CreatePersonalSite method also stitches up all the connections between the Profile, and the Site. You will be able to create the sites, but the various properties and methods of the profile will be incorrect.

Oh – and finally, of course we will need to release resources to prevent Powershell from eating the server’s memory:

$mysite_host_site.Close() > $null
$mysite_host_site.Dispose() > $null

Iterating through Users with Powershell

One of the more common things you might need to do in a Powershell script is iterate through the users within a system. A great way of doing this is via an LDAP lookup to Active Directory. Rather than use the collection of AD users directly, it makes sense to compile them into an ArrayList first – which then gives you an enumerable collection that’s quick to check the contents of, and easy to loop through.

Here’s an example:

$account_names = New-Object System.Collections.ArrayList
$ldap_path = &quot;OU=foo,DC=widgetcorp&quot;
$ad_users = Get-ADUser -SearchBase $ldap_path -filter { (objectcategory -eq 'person') -And (objectclass -eq 'user') }
$account_names = New-Object System.Collections.ArrayList
foreach($ad_user in $ad_users){
    $account_names.add($ad_user.SamAccountName) &gt; $null
}

Notice that the call to Get-ADUser has a filter on it to restrict the search to (hopefully) only humans – which should remove system accounts.

Once you have the collection, it’s easy to use it for other things:

foreach ($account_name in $account_names)
{
  # do something with $account_name
}

Reading CSV files with Powershell

Powershell has a wonderful commandlet that sucks a CSV file into memory in one go (import-csv), but the contents and methods required to read the CSV once you have loaded it are a bit odd.

Reading the Contents of the CSV

This is the easy bit – and owing to Microsoft’s lamentable documentation, there really isn’t much information about how it does what it does… (which goes for a lot of Microsoft Powershell commandlets, to be honest.

$records = import-csv "c:\filename.csv" 
foreach ($record in $records) {
  write-host $field_value = $record.Fieldname `n
}

This is all very clever, but you can see immediately – the problem we have is the record objects that the commandlet creates are pre-populated with properties – and hard-coding isn’t always the best way to go. So how do we get the field names?

Reading the Column Headers

This is how you read the fieldnames from the CSV – so you might then be able to use them to read the contents of the CSV later:

$csv = import-csv "c:\temp\test.csv"
$headers = $csv | Get-Member -MemberType NoteProperty | foreach { $_.name }

So what’s going on here? It turns out the CSV object compiled by import-csv has an object called “NoteProperty”, that contains a collection of objects – each of which have a name property – that corresponds with the name of the heading of a column in the CSV file.

Who Invented What ?

While listening to a podcast on the way to work a little while ago, the host made the classic mistake that nearly everybody makes, in naming Tim Berners Lee as the inventor of “The Internet”. It was just a comment in passing (they were talking about Ray Tomlinson, who had passed away), but it got me thinking. I’ve read all manner of books about the history of the internet, and of computers in general – so perhaps know a little bit more than most.

Let’s go through each subject in turn…

Who invented the Internet

Before we answer that one, it’s worth figuring out what we really mean by “the internet”. If we mean the network of geographically dispersed computers transmitting and receiving information between each other – then that was almost certainly invented by Bob Kahn, and Vint Cerf. They invented TCP (Transmission Control Protocol) – a means by which machines could encode and decode packets of information to send between each other.

If we mean the concept of connecting many computers together, such that they might talk to each other (but wired directly to each other), then J.C.R. Licklider is the inventor. He was in charge of the project that connected military computers together into a network that became known as ARPANET in the 1960s.

Who invented E-Mail

Almost certainly NOT Ray Tomlinson (who is often cited). Ray is credited with establishing the use of the @ character to delimit the username, and network they were a member of. He is also credited with the first application to send and receive messages across a wide area network (the Arpanet). Sending messages to each other via computer terminals on the same system pre-dates Ray Tomlinson’s work by many years. Tom Van Vleck and Noel Morris invented the first “Mail” application (as we now think of mail) in 1965 at MIT. Technically even they were not the first though – similar applications existed on other computer systems, but were not thought of as “mail” yet.

Who invented the World Wide Web

This one is much more simple to answer – the web was invented by Tim Berners Lee while working at CERN, as a means the scientific community might use to share information between each other. The project he worked on developed the first markup language to carry pages of information (HTML), applications to view the interpreted information (Browsers), and applications to serve those pages (Web Servers).

Who Invented the Mouse?

Although Douglas Englebart is widely credited with inventing the mouse, a german engineer called Rainer Mallebrein may have actually got there first. Ironically, the trackball had existed for many years in various guises, but nobody had considered reversing it until Mallebrein and Englebart came along.

Who Invented Public Key Cryptography

This is a pretty funny one. The mathematical method of encrypting data known as RSA is named after the three American computer scientists that came up with it – Ron Rivest, Adi Shamir, and Leonard Ableman. They are actually pre-dated by at least 5 years by Clifford Cocks, who worked at GCHQ in the UK – only his work was classified. The method computers use to exchange keys is called “Diffie Hellman” key exchange, after the American computer scientists that came up with it – Whifield Diffie, and Martin Hellman. Again, it is pre-dated by the same invention by Malcom Williamson at GCHQ in the UK – and again, his work was classified.